MODIFIED HIV ENV POLYPEPTIDES



Technical Field 

The invention relates generally to modified HIV envelope (Env) polypeptides which
are useful as immunizing agents or for generating an immune response in a subject, for
example a cellular immune response or a protective immune response. More particularly, the
invention relates to Env polypeptides such as gp120, gp140 or gp160, wherein at least one of
the native β-sheet configurations has been modified. The invention also pertains to methods
of using these polypeptides to elicit an immune response against a broad range of HIV
subtypes.

Background of the Invention 

The human immunodeficiency virus (HIV-1, also referred to as HTLV-III, LAV or
HTLV-III/LAV) is the etiological agent of the acquired immune deficiency syndrome (AIDS)
and related disorders (see, e.g., Barre-Sinoussi, et al., (1983) Science 220:868-871; Gallo et
al. (1984) Science 224:500-503; Levy et al., (1984) Science 225:840-842; Siegal et al., (1981)
N. Engl. J. Med. 305:1439-1444). AIDS patients usually have a long asymptomatic period
followed by the progressive degeneration of the immune system and the central nervous
system. Replication of the virus is highly regulated, and both latent and lytic infection of the
CD4 positive helper subset of T-lymphocytes occur in tissue culture (Zagury et al., (1986)
Science 231:850-853). Molecular studies of HIV-1 show that it encodes a number of genes
(Ratner et al., (1985) Nature 313:277-284; Sanchez-Pescador et al., (1985) Science
227:484-492), including three structural genes - gag, pol and env - that are common to all
retroviruses. Nucleotide sequences from viral genomes of other retroviruses, particularly
HIV-2 and simian immunodeficiency viruses, SIV (previously referred to as STLV-III), also
contain these structural genes (Guyader et al., (1987) Nature 326:662-669; Chakrabarti et
al., (1987) Nature).

The envelope protein of HIV-1, HIV-2 and SIV is a glycoprotein of about 160 kd
(gp160). During virus infection of the host cell, gp160 is cleaved by host cell proteases to
form gp120 and the integral membrane protein, gp41. The gp41 portion is anchored in the
membrane bilayer of the virion, while the gp120 segment protrudes into the surrounding
environment. gp120 and gp41 are non-covalently associated, and free gp120 can be released
from the surface of virions and infected cells.

As depicted in Figure 1, crystallography studies of the gp120 core polypeptide
indicate that this polypeptide is folded into two major domains having certain emanating
structures. The inner domain (inner with respect to the N and C terminus) features a
two-helix, two-stranded bundle with a small five-stranded β-sandwich at its termini-proximal
end and a projection at the distal end from which the V1/V2 stem emanates. The outer domain
is a stacked double barrel that lies alongside the inner domain so that the outer barrel and
inner bundle axes are approximately parallel. Between the distal inner domain and the distal
outer domain is a four-stranded bridging sheet which holds a peculiar minidomain in contact
with, but distinct from, the inner domain, the outer domain, and the V1/V2 domain. The
bridging sheet is composed of four β-strand structures (β-3, β-2, β-21, β-20, shown in
Figure 1). The bridging region can be seen in Figure 1 packing primarily over the inner
domain, although some surface residues of the outer domain, such as Phe 382, reach into the
bridging sheet to form part of its hydrophobic core.

The basic unit of the β-sheet conformation of the bridging sheet region is the β-strand,
which exists as a less tightly coiled helix, with 2.0 residues per turn. The β-strand
conformation is only stable when incorporated into a β-sheet, where hydrogen bonds with
close to optimal geometry are formed between the peptide groups on adjacent β-strands; the
dipole moments of the strands are also aligned favorably. Side chains from adjacent residues
of the same strand protrude from opposite sides of the sheet and do not interact with each
other, but have significant interactions with their backbone and with the side chains of
neighboring strands. For a general description of β-sheets, see, e.g., T.E. Creighton, Proteins:
Structures and Molecular Properties (W.H. Freeman and Company, 1993); and A.L.
Lehninger, Biochemistry (Worth Publishers, Inc., 1975).

The gp120 polypeptide is instrumental in mediating entry into the host cell. Recent
studies have indicated that binding of CD4 to gp120 induces a conformational change in Env
that allows for binding to a co-receptor (e.g., a chemokine receptor) and subsequent entry of
the virus into the cell (Wyatt, R., et al. (1998) Nature 393:705-711; Kwong, P., et al. (1998)
Nature 393:648-659). Referring again to Figure 1, CD4 is bound into a depression formed at
the interface of the outer domain, the inner domain and the bridging sheet of gp120.

Immunogenicity of the gp120 polypeptide has also been studied. For example,
individuals infected by HIV-1 usually develop antibodies that can neutralize the virus in in
vitro assays, and this response is directed primarily against linear neutralizing determinants in
the third variable loop of gp120 glycoprotein (Javaherian, K., et al. (1989) Proc. Natl. Acad.
Sci. 86:6786-6772; Matsushita, M., et al. (1988) J. Virol. 62:2107-2144; Putney, S., et al.
(1986) Science 234:1392-1395; Rushe, J. R., et al. (1988) Proc. Nat. Acad. Sci. USA
85:3198-3202). However, these antibodies generally exhibit the ability to neutralize only a
limited number of HIV-1 strains (Matthews, T. (1986) Proc. Natl. Acad. Sci. USA
83:9709-9713; Nara, P. L., et al. (1988) J. Virol. 62:2622-2628; Palker, T. J., et al. (1988)
Proc. Natl. Acad. Sci. USA 85:1932-1936). Later in the course of HIV infection in humans,
antibodies capable of neutralizing a wider range of HIV-1 isolates appear (Barre-Sinoussi, F.,
et al. (1983) Science 220:868-871; Robert-Guroff, M., et al. (1985) Nature (London)
316:72-74; Weis, R., et al. (1985) Nature (London) 316:69-72; Weis, R., et al. (1986) Nature
(London) 324:572-575).

Recent work done by Stamatatos et al. (1998) AIDS Res Hum Retroviruses
14(13):1129-39 shows that a deletion of the variable region 2 from an HIV-1 SF162 virus, which
utilizes the CCR-5 co-receptor for virus entry, rendered the virus highly susceptible to
serum-mediated neutralization. This V2 deleted virus was also neutralized by sera obtained
from patients infected not only with clade B HIV-1 isolates but also with clade A, C, D and F
HIV-1 isolates. However, deletion of the variable region 1 had no effect. Deletion of the
variable regions 1 and 2 from a LAI isolate HIV-1 IIIB also increased the susceptibility to
neutralization by monoclonal antibodies whose epitopes are located within the V3 loop, the
CD4-binding site, and conserved gp120 regions (Wyatt, R., et al. (1995) J. Virol.
69:5723-5733). Rabbit immunogenicity studies done with the HIV-1 virus with deletions in the
V1/V2 and V3 region from the LAI strain, which uses the CXCR4 co-receptor for virus entry,
showed no improvement in the ability of Env to raise neutralizing antibodies (Leu et al.
(1998) AIDS Res. and Human Retroviruses 14:151-155).

Further, a subset of the broadly reactive antibodies, found in most infected
individuals, interferes with the binding of gp120 and CD4 (Kang, C.-Y., et al. (1991) Proc.
Natl. Acad. Sci. USA 88:6171-6175; McDougal, J. S., et al. (1986) J. Immunol.
137:2937-2944). Other antibodies are believed to bind to the chemokine receptor binding
region after CD4 has bound to Env (Thali et al. (1993) J. Virol. 67:3978-3988). The fact that
neutralizing antibodies generated during the course of HIV infection do not provide permanent
antiviral effect may in part be due to the generation of "neutralization escape" virus mutants
and to the general decline in the host immune system associated with pathogenesis. In
contrast, the presence of pre-existing neutralizing antibodies upon initial HIV-1 exposure will
likely have a protective effect.

It is widely thought that a successful vaccine should be able to induce a strong,
broadly neutralizing antibody response against diverse HIV-1 strains (Montefiori and Evans
(1999) AIDS Res. Hum. Ret. 15(8):689-698; Bolognesi, D. P., et al. (1994) Ann. Int. Med.
8:603-611; Haynes, B. F., et al. (1996) Science 271:324-328). Neutralizing antibodies, by
attaching to the incoming virions, can reduce or even prevent their infectivity for target cells
and prevent the cell-to-cell spread of virus in tissue culture (Hu et al. (1992) Science
255:456-459; Burton, D. R. and Montefiori, D. (1997) AIDS 11(suppl. A):587-598). However,
as described above, antibodies directed against gp120 generally do not react broadly against
different HIV strains.

Currently, the focus of vaccine development, from the perspective of humoral
immunity, is on the neutralization of primary isolates that utilize the CCR5 chemokine
co-receptor believed to be important in virus entry (Zhu, T., et al. (1993) Science
261:1179-1181; Fiore, J., et al. (1994) Virology 204:297-303). These viruses are generally
much more resistant to antibody neutralization than T-cell line adapted strains that use the
CXCR4 co-receptor, although both can be neutralized in vitro by certain broadly and potently
acting monoclonal antibodies, such as IgG1b12, 2G12 and 2F5 (Trkola, A., et al. (1995) J.
Virol. 69:6609-6617; D'Sousa PM., et al. (1997) J. Infect. Dis. 175:1062-1075). These
monoclonal antibodies are directed to the CD4 binding site, a glycosylation site and to the
gp41 fusion domain, respectively. The problem that remains, however, is that it is not known
how to induce antibodies of the appropriate specificity by vaccination. Antibodies (Abs)
elicited by gp120 glycoprotein from a given isolate are usually only able to neutralize closely
related viruses, generally from a similar and usually from the same HIV-1 subtype.

Despite the above approaches, there remains a need for Env antigens that can elicit an
immunological response (e.g., neutralizing and/or protective antibodies) in a subject against
multiple HIV strains and subtypes, for example when administered as a vaccine. The present
invention solves these and other problems by providing modified Env polypeptides (e.g.,
gp120) to expose epitopes in or near the CD4 binding site.

Summary of the Invention

In accordance with the present invention, modified HIV Env polypeptides are
provided. In particular, deletions and/or mutations are made in one or more of the four
antiparallel β-strands of the bridging sheet in the HIV Env polypeptide. In this way, enough
structure is left to allow correct folding of the polypeptide, for example of gp120, yet enough
of the bridging sheet is removed to expose the CD4 groove, allowing an immune response to be
generated against epitopes in or near the CD4 binding site of the Env polypeptide (e.g.,
gp120).

In one aspect, the invention includes a polynucleotide encoding a modified HIV Env
polypeptide wherein the polypeptide has at least one amino acid residue modified (e.g.,
deleted or replaced) in the region corresponding to residues 421 to 436 relative to
HXB-2, for example the constructs depicted in Figures 6-29 (SEQ ID NOs:3 to 26). In
certain embodiments, the polynucleotide also has the region corresponding to residues
124-198 of the polypeptide HXB-2 (e.g., V1/V2) deleted and at least one amino acid deleted or
replaced in the regions corresponding to the residues 119 to 123 and 199 to 210, relative to
HXB-2. In other embodiments, these polynucleotides encode Env polypeptides having at
least one amino acid of the small loop of the bridging sheet (e.g., amino acid residues 427 to
429 relative to HXB-2) deleted or replaced. The amino acid sequences of the modified
polypeptides encoded by the polynucleotides of the present invention can be based on any
HIV variant, for example SF162.

In another aspect, the invention includes immunogenic modified HIV Env
polypeptides having at least one amino acid residue modified (e.g., deleted or replaced)
in the region corresponding to residues 421 to 436 relative to HXB-2, for example a
deletion or replacement of at least one amino acid in the small loop region (e.g., amino acid
residues 427 to 429 relative to HXB-2). These polypeptides may have modifications (e.g., a
deletion or a replacement) of at least one amino acid between about amino acid residue 420
and amino acid residue 436, relative to HXB-2 and, optionally, may have deletions or
truncations of the V1 and/or V2 regions. The immunogenic, modified polypeptides of the
present invention can be based on any HIV variant, for example SF162.

In another aspect, the invention includes a vaccine composition comprising any of the
polynucleotides encoding modified Env polypeptides described above. Vaccine
compositions comprising the modified Env polypeptides and, optionally, an adjuvant are also
included in the invention.

In yet another aspect, the invention includes a method of inducing an immune
response in a subject comprising administering one or more of the polynucleotides or
constructs described above in an amount sufficient to induce an immune response in the
subject. In certain embodiments, the method further comprises administering an adjuvant to
the subject.

In another aspect, the invention includes a method of inducing an immune response in 
a subject comprising administering a composition comprising any of the modified Env 
polypeptides described above and an adjuvant. The composition is administered in an 
amount sufficient to induce an immune response in the subject. 

In another aspect, the invention includes a method of inducing an immune response in
a subject comprising

(a) administering a first composition comprising any of the polynucleotides described
above in a priming step and

(b) administering a second composition comprising any of the modified Env
polypeptides described above, as a booster, in an amount sufficient to induce an immune
response in the subject. In certain embodiments, the first composition, the second
composition or both the first and second compositions further comprise an adjuvant.

These and other embodiments of the subject invention will readily occur to those of 
skill in the art in light of the disclosure herein. 

Brief Description of the Drawings 

Figure 1 is a schematic depiction of the tertiary structure of the HIV-1 HXB-2 Env gp120
polypeptide, as determined by crystallography studies.

Figures 2A-C depict alignment of the amino acid sequence of wild-type HIV-1 HXB-2
Env gp160 polypeptide (SEQ ID NO:1) with amino acid sequence of HIV variants SF162
(shown as "162") (SEQ ID NO:2), SF2, CM236 and US4. Arrows indicate the regions that
are deleted or replaced in the modified polypeptides. Black dots indicate conserved cysteine
residues. The star indicates the position of the last amino acid in gp120.

Figures 3A-J depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having V1/V2 deletions. The unmodified amino acid residues
encoded by these sequences correspond to wildtype SF162 residues but are numbered relative
to HXB-2.

Figures 4A-M depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having deletions or replacements in the small loop. The
unmodified amino acid residues encoded by these sequences correspond to wildtype SF162
residues but are numbered relative to HXB-2.

Figures 5A-N depict alignment of nucleotide sequences of polynucleotides encoding
modified Env polypeptides having both V1/V2 deletions and, in addition, deletions or
replacements in the small loop. The unmodified amino acid residues encoded by these
sequences correspond to wildtype SF162 residues but are numbered relative to HXB-2.

Figure 6 depicts the nucleotide sequence of the construct designated Val120-Ala204
(SEQ ID NO:3).

Figure 7 depicts the nucleotide sequence of the construct designated Val120-Ile201
(SEQ ID NO:4).

Figure 8 depicts the nucleotide sequence of the construct designated Val120-Ile201B
(SEQ ID NO:5).

Figure 9 depicts the nucleotide sequence of the construct designated Lys121-Val200
(SEQ ID NO:6).

Figure 10 depicts the nucleotide sequence of the construct designated Leu122-Ser199
(SEQ ID NO:7).

Figure 11 depicts the nucleotide sequence of the construct designated Val120-Thr202
(SEQ ID NO:8).

Figure 12 depicts the nucleotide sequence of the construct designated Trp427-Gly431
(SEQ ID NO:9).

Figure 13 depicts the nucleotide sequence of the construct designated Arg426-Gly431
(SEQ ID NO:10).

Figure 14 depicts the nucleotide sequence of the construct designated Arg426-Gly431B
(SEQ ID NO:11).

Figure 15 depicts the nucleotide sequence of the construct designated Arg426-Lys432
(SEQ ID NO:12).

Figure 16 depicts the nucleotide sequence of the construct designated Asn425-Lys432
(SEQ ID NO:13).

Figure 17 depicts the nucleotide sequence of the construct designated Ile424-Ala433
(SEQ ID NO:14).

Figure 18 depicts the nucleotide sequence of the construct designated Ile423-Met434
(SEQ ID NO:15).

Figure 19 depicts the nucleotide sequence of the construct designated Gln422-Tyr435
(SEQ ID NO:16).

Figure 20 depicts the nucleotide sequence of the construct designated Gln422-Tyr435B
(SEQ ID NO:17).

Figure 21 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Arg426-Gly431 (SEQ ID NO:18).

Figure 22 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Arg426-Lys432 (SEQ ID NO:19).

Figure 23 depicts the nucleotide sequence of the construct designated Leu122-Ser199;
Trp427-Gly431 (SEQ ID NO:20).

Figure 24 depicts the nucleotide sequence of the construct designated Lys121-Val200;
Asn425-Lys432 (SEQ ID NO:21).

Figure 25 depicts the nucleotide sequence of the construct designated Val120-Ile201;
Ile424-Ala433 (SEQ ID NO:22).

Figure 26 depicts the nucleotide sequence of the construct designated Val120-Ile201B;
Ile424-Ala433 (SEQ ID NO:23).

Figure 27 depicts the nucleotide sequence of the construct designated Val120-Thr202;
Ile424-Ala433 (SEQ ID NO:24).

Figure 28 depicts the nucleotide sequence of the construct designated Val127-Asn195
(SEQ ID NO:25).

Figure 29 depicts the nucleotide sequence of the construct designated Val127-Asn195;
Arg426-Gly431 (SEQ ID NO:26).

Detailed Description of the Invention 

The practice of the present invention will employ, unless otherwise indicated,
conventional methods of protein chemistry, viral immunobiology, molecular biology and
recombinant DNA techniques within the skill of the art. Such techniques are explained fully
in the literature. See, e.g., T.E. Creighton, Proteins: Structures and Molecular Properties
(W.H. Freeman and Company, 1993); Nelson L.M. and Jerome H.K., HIV Protocols in
Methods in Molecular Medicine, vol. 17, 1999; Sambrook, et al., Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor Laboratory, 1989); F.M. Ausubel et al., Current
Protocols in Molecular Biology, Greene Publishing Associates & Wiley Interscience, New
York; and Lipkowitz and Boyd, Reviews in Computational Chemistry, volumes 1-present
(Wiley-VCH, New York, New York, 1999).

It must be noted that, as used in this specification and the appended claims, the
singular forms "a", "an" and "the" include plural referents unless the content clearly dictates
otherwise. Thus, for example, reference to "a polypeptide" includes a mixture of two or more
polypeptides, and the like.

Definitions

In describing the present invention, the following terms will be employed, and are
intended to be defined as indicated below.

The terms "polypeptide" and "protein" are used interchangeably herein to denote any
polymer of amino acid residues. The terms encompass peptides, oligopeptides, dimers,
multimers, and the like. Such polypeptides can be derived from natural sources or can be
synthesized or recombinantly produced. The terms also include postexpression modifications
of the polypeptide, for example, glycosylation, acetylation, phosphorylation, etc.

A polypeptide as defined herein is generally made up of the 20 natural amino acids
Ala (A), Arg (R), Asn (N), Asp (D), Cys (C), Gln (Q), Glu (E), Gly (G), His (H), Ile (I), Leu
(L), Lys (K), Met (M), Phe (F), Pro (P), Ser (S), Thr (T), Trp (W), Tyr (Y) and Val (V) and
may also include any of the several known amino acid analogs, both naturally occurring and
synthesized analogs, such as but not limited to homoisoleucine, asaleucine,
2-(methylenecyclopropyl)glycine, S-methylcysteine, S-(prop-1-enyl)cysteine, homoserine,
ornithine, norleucine, norvaline, homoarginine, 3-(3-carboxyphenyl)alanine,
cyclohexylalanine, mimosine, pipecolic acid, 4-methylglutamic acid, canavanine,
2,3-diaminopropionic acid, and the like. Further examples of polypeptide agents which will
find use in the present invention are set forth below.

By "geometry" or "tertiary structure" of a polypeptide or protein is meant the overall
3-D configuration of the protein. As described herein, the geometry can be determined, for
example, by crystallography studies or by using various programs or algorithms which
predict the geometry based on interactions between the amino acids making up the primary
and secondary structures.

By "wild type" polypeptide, polypeptide agent or polypeptide drug, is meant a
naturally occurring polypeptide sequence, and its corresponding secondary structure. An
"isolated" or "purified" protein or polypeptide is a protein which is separate and discrete from
a whole organism with which the protein is normally associated in nature. It is apparent that
the term denotes proteins of various levels of purity. Typically, a composition containing a
purified protein will be one in which at least about 35%, preferably at least about 40-50%,
more preferably, at least about 75-85%, and most preferably at least about 90% or more, of
the total protein in the composition will be the protein in question.

By "Env polypeptide" is meant a molecule derived from an envelope protein,
preferably from HIV Env. The envelope protein of HIV-1 is a glycoprotein of about 160 kd
(gp160). During virus infection of the host cell, gp160 is cleaved by host cell proteases to
form gp120 and the integral membrane protein, gp41. The gp41 portion is anchored in (and
spans) the membrane bilayer of the virion, while the gp120 segment protrudes into the
surrounding environment. As there is no covalent attachment between gp120 and gp41, free
gp120 is released from the surface of virions and infected cells. Env polypeptides may also
include gp140 polypeptides. Env polypeptides can exist as monomers, dimers or multimers.

By a "gp120 polypeptide" is meant a molecule derived from a gp120 region of the
Env polypeptide. Preferably, the gp120 polypeptide is derived from HIV Env. The primary
amino acid sequence of gp120 is approximately 511 amino acids, with a polypeptide core of
about 60,000 daltons. The polypeptide is extensively modified by N-linked glycosylation to
increase the apparent molecular weight of the molecule to 120,000 daltons. The amino acid
sequence of gp120 contains five relatively conserved domains interspersed with five
hypervariable domains. The positions of the 18 cysteine residues in the gp120 primary
sequence of the HIV-1 HXB-2 (hereinafter "HXB-2") strain, and the positions of 13 of the
approximately 24 N-linked glycosylation sites in the gp120 sequence are common to most, if
not all, gp120 sequences. The hypervariable domains contain extensive amino acid
substitutions, insertions and deletions. Despite this variation, most, if not all, gp120
sequences preserve the virus's ability to bind to the viral receptor CD4. A "gp120
polypeptide" includes both single subunits and multimers.
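
As an illustration of how candidate N-linked glycosylation sites can be located in a
primary sequence, the following minimal sketch scans for the commonly used consensus
sequon Asn-X-Ser/Thr, where X is any residue other than Pro. The function name and the
toy sequence are hypothetical and are not taken from any gp120 sequence disclosed herein;
presence of a sequon indicates only a potential site, not actual glycosylation.

```python
import re

# Consensus sequon for N-linked glycosylation: Asn, any residue except Pro,
# then Ser or Thr. The lookahead lets overlapping sequons each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq: str) -> list:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical toy fragment for illustration only (not a real gp120 sequence).
toy_fragment = "MNVTANPSGNNTSK"
print(find_sequons(toy_fragment))
```

Positions returned by such a scan index the input string directly; to express them
relative to HXB-2, the sequence would first have to be aligned to the HXB-2 reference.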

Env polypeptides (e.g., gp120, gp140 and gp160) include a "bridging sheet"
comprised of 4 anti-parallel β-strands (β-2, β-3, β-20 and β-21) that form a β-sheet.
Extruding from one pair of the β-strands (β-2 and β-3) are two loops, V1 and V2. The β-2
strand occurs at approximately amino acid residue 119 (Cys) to amino acid residue 123 (Thr)
while β-3 occurs at approximately amino acid residue 199 (Ser) to amino acid residue 201
(Ile), relative to HXB-2. The "V1/V2 region" occurs at approximately amino acid positions
126 (Cys) to residue 196 (Cys), relative to HXB-2 (see, e.g., Wyatt et al. (1995) J. Virol.
69:5723-5733; Stamatatos et al. (1998) J. Virol. 72:7840-7845). Extruding from the second
pair of β-strands (β-20 and β-21) is a "small-loop" structure, also referred to herein as "the
bridging sheet small loop." In HXB-2, β-20 extends from about amino acid residue 422
(Gln) to amino acid residue 426 (Met) while β-21 extends from about amino acid residue 430
(Val) to amino acid residue 435 (Tyr). In variant SF162, the Met-426 is an Arg (R) residue.
The "small loop" extends from about amino acid residue 427 (Trp) through 429 (Lys),
relative to HXB-2. A representative diagram of gp120 showing the bridging sheet, the small
loop, and V1/V2 is shown in Figure 1. In addition, alignment of the amino acid sequences of
Env polypeptide gp160 of selected variants is shown, relative to HXB-2, in Figures 2A-C.
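
The approximate boundaries given above can be collected into a simple lookup, which may
help when checking which structural element a given HXB-2 residue number falls in. The
following is a minimal sketch; the dictionary and function names are hypothetical, and the
ranges are the approximate values stated in the preceding paragraph rather than exact
structural limits.

```python
from typing import Optional

# Approximate HXB-2 residue ranges (inclusive) as stated in the text above.
BRIDGING_SHEET_REGIONS = {
    "beta-2": (119, 123),
    "V1/V2": (126, 196),
    "beta-3": (199, 201),
    "beta-20": (422, 426),
    "small loop": (427, 429),
    "beta-21": (430, 435),
}


def region_of(hxb2_residue: int) -> Optional[str]:
    """Return the named region containing the HXB-2 residue, or None."""
    for name, (start, end) in BRIDGING_SHEET_REGIONS.items():
        if start <= hxb2_residue <= end:
            return name
    return None


print(region_of(428))  # residue in the small loop
print(region_of(124))  # residue between beta-2 and V1/V2 as defined above
```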

Furthermore, an "Env polypeptide" or "gp120 polypeptide" as defined herein is not
limited to a polypeptide having the exact sequence described herein. Indeed, the HIV
genome is in a state of constant flux and contains several variable domains which exhibit
relatively high degrees of variability between isolates. It is readily apparent that the terms
encompass Env (e.g., gp120) polypeptides from any of the identified HIV isolates, as well as
newly identified isolates, and subtypes of these isolates. Descriptions of structural features
are given herein with reference to HXB-2. One of ordinary skill in the art in view of the
teachings of the present disclosure and the art can determine corresponding regions in other
HIV variants (e.g., isolates HIV IIIb, HIV SF2, HIV-1 SF162, HIV-1 SF170, HIV LAV, HIV LAI,
HIV MN, HIV-1 CM235, HIV-1 US4, other HIV-1 strains from diverse subtypes (e.g., subtypes A
through G, and O), HIV-2 strains and diverse subtypes (e.g., HIV-2 UC and HIV-2 UC2), and
simian immunodeficiency virus (SIV); see, e.g., Virology, 3rd Edition (W.K. Joklik ed. 1988);
Fundamental Virology, 2nd Edition (B.N. Fields and D.M. Knipe, eds. 1991); Virology, 3rd
Edition (Fields, BN, DM Knipe, PM Howley, Editors, 1996, Lippincott-Raven, Philadelphia,
PA), for a description of these and other related viruses), using, for example, sequence
comparison programs (e.g., BLAST and others described herein) or identification and
alignment of structural features (e.g., a program such as the "ALB" program described herein
that can identify β-sheet regions). The actual amino acid sequences of the modified Env
polypeptides can be based on any HIV variant.
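
Once a variant sequence has been aligned to HXB-2 by a sequence comparison program, a
reference residue number can be translated into the variant's own numbering by walking the
aligned columns. The following is a minimal sketch of that bookkeeping, assuming the
alignment is supplied as two equal-length strings with "-" marking gaps; the function name
and the toy alignment are hypothetical and do not represent real Env sequences.

```python
from typing import Optional


def map_reference_position(aligned_ref: str, aligned_var: str, ref_pos: int) -> Optional[int]:
    """Map a 1-based reference position to the 1-based variant position.

    Returns None when the reference residue is aligned to a gap in the variant.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    var_count = 0
    for ref_char, var_char in zip(aligned_ref, aligned_var):
        if ref_char != "-":
            ref_count += 1
        if var_char != "-":
            var_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return var_count if var_char != "-" else None
    raise ValueError("ref_pos is beyond the end of the reference sequence")


# Hypothetical toy alignment (reference on top, variant below).
ref = "ACD-EFG"
var = "A-DKEFG"
print(map_reference_position(ref, var, 3))  # reference residue 3 in variant numbering
```

The same walk can be run in the other direction to label variant residues with reference
numbers, which is how residues of one isolate can be described "relative to HXB-2".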

Additionally, the term "Env polypeptide" (e.g., "gp120 polypeptide") encompasses
proteins which include additional modifications to the native sequence, such as additional
internal deletions, additions and substitutions. These modifications may be deliberate, as
through site-directed mutagenesis, or may be accidental, such as through naturally occurring
mutational events. Thus, for example, if the Env polypeptide is to be used in vaccine
compositions, the modifications must be such that immunological activity (i.e., the ability to
elicit an antibody response to the polypeptide) is not lost. Similarly, if the polypeptides are to
be used for diagnostic purposes, such capability must be retained.

Thus, a "modified Env polypeptide" is an Env polypeptide (e.g., gp120 as defined
above), which has been manipulated to delete or replace all or a part of the bridging sheet
portion and, optionally, the variable regions V1 and V2. Generally, modified Env (e.g.,
gp120) polypeptides have enough of the bridging sheet removed to expose the CD4 binding
site, but leave enough of the structure to allow correct folding (e.g., correct geometry). Thus,
modifications to the β-20 and β-21 regions (between about amino acid residues 420 and 435
relative to HXB-2) are preferred. Additionally, modifications to the β-2 and β-3 regions
(between about amino acid residues 119 (Cys) and 201 (Ile)) and modifications (e.g.,
truncations) to the V1 and V2 loop regions may also be made. Although not all possible
β-sheet and V1/V2 modifications have been exemplified herein, it is to be understood that
other disrupting modifications are also encompassed by the present invention.
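
At the sequence level, deleting or replacing a stretch of residues amounts to splicing the
flanking segments together, optionally with substitute residues in between. The following
is a minimal sketch of that operation on a plain amino acid string; the function name, the
toy sequence and the "GG" replacement are hypothetical and are not the constructs of the
Figures. The positions index the toy string directly, whereas positions given relative to
HXB-2 would first need to be mapped onto the sequence being modified.

```python
def replace_region(seq: str, start: int, end: int, replacement: str = "") -> str:
    """Delete residues start..end (1-based, inclusive) and splice in a replacement.

    With the default empty replacement this is a plain deletion.
    """
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("region lies outside the sequence")
    return seq[: start - 1] + replacement + seq[end:]


# Hypothetical toy sequence for illustration only.
toy = "MKTAYIAKQR"
print(replace_region(toy, 4, 6))        # deletion of residues 4-6
print(replace_region(toy, 4, 6, "GG"))  # replacement of residues 4-6
```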

Normally, such a modified polypeptide is capable of secretion into growth medium in
which an organism expressing the protein is cultured. However, for purposes of the present
invention, such polypeptides may also be recovered intracellularly. Secretion into growth
media is readily determined using a number of detection techniques, including, e.g.,
polyacrylamide gel electrophoresis and the like, and immunological techniques such as
Western blotting and immunoprecipitation assays as described in, e.g., International
Publication No. WO 96/04301, published February 15, 1996.

A gp120 or other Env polypeptide is produced "intracellularly" when it is found
within the cell, either associated with components of the cell, such as in association with the
endoplasmic reticulum (ER) or the Golgi Apparatus, or when it is present in the soluble
cellular fraction. The gp120 and other Env polypeptides of the present invention may also be
secreted into growth medium so long as sufficient amounts of the polypeptides remain
present within the cell such that they can be purified from cell lysates using techniques
described herein.

An "immunogenic" gp120 or other Env protein is a molecule that includes at least one
epitope such that the molecule is capable of either eliciting an immunological reaction in an
individual to which the protein is administered or, in the diagnostic context, is capable of
reacting with antibodies directed against the HIV in question.

By "epitope" is meant a site on an antigen to which specific B cells and/or T cells 
respond, rendering the molecule including such an epitope capable of eliciting an 
immunological reaction or capable of reacting with HIV antibodies present in a biological 

10 sample. The term is also used interchangeably with "antigenic determinant" or "antigenic 
determinant site." An epitope can comprise 3 or more amino acids in a spatial conformation 
unique to the epitope. Generally, an epitope consists of at least 5 such amino acids and, more 
usually, consists of at least 8-10 such amino acids. Methods of determining spatial 
conformation of amino acids are known in the art and include, for example, x-ray 
crystallography and 2-dimensional nuclear magnetic resonance. Furthermore, the 
identification of epitopes in a given protein is readily accomplished using techniques well 
known in the art, such as by the use of hydrophobicity studies and by site-directed serology. 
See, also, Geysen et al., Proc. Natl. Acad. Sci. USA (1984) 81:3998-4002 (general method of 
rapidly synthesizing peptides to determine the location of immunogenic epitopes in a given 
antigen); U.S. Patent No. 4,708,871 (procedures for identifying and chemically synthesizing 
epitopes of antigens); and Geysen et al., Molecular Immunology (1986) 23:709-715 
(technique for identifying peptides with high affinity for a given antibody). Antibodies that 
recognize the same epitope can be identified in a simple immunoassay showing the ability of 
one antibody to block the binding of another antibody to a target antigen. 

An "immunological response" or "immune response" as used herein is the 
development in the subject of a humoral and/or a cellular immune response to the Env (e.g., 
gp120) polypeptide when the polypeptide is present in a vaccine composition. Antibodies 
so elicited may also neutralize infectivity, and/or mediate antibody-complement or antibody 
dependent cell cytotoxicity to provide protection to an immunized host. Immunological 
reactivity may be determined in standard immunoassays, such as competition assays, well 
known in the art. 

Techniques for determining amino acid sequence "similarity" are well known in the 
art. In general, "similarity" means the exact amino acid to amino acid comparison of two or 
more polypeptides at the appropriate place, where amino acids are identical or possess similar 
chemical and/or physical properties such as charge or hydrophobicity. A so-termed "percent 
similarity" then can be determined between the compared polypeptide sequences. 

Techniques for determining nucleic acid and amino acid sequence identity also are well 
known in the art and include determining the nucleotide sequence of the mRNA for that gene 
(usually via a cDNA intermediate) and determining the amino acid sequence encoded 
thereby, and comparing this to a second amino acid sequence. In general, "identity" refers to 
an exact nucleotide to nucleotide or amino acid to amino acid correspondence of two 
polynucleotides or polypeptide sequences, respectively. 

Two or more polynucleotide sequences can be compared by determining their 
"percent identity." Two or more amino acid sequences likewise can be compared by 
determining their "percent identity." The percent identity of two sequences, whether nucleic 
acid or peptide sequences, is generally described as the number of exact matches between two 
aligned sequences divided by the length of the shorter sequence and multiplied by 100. An 
approximate alignment for nucleic acid sequences is provided by the local homology 
algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981). 
This algorithm can be extended to use with peptide sequences using the scoring matrix 
developed by Dayhoff, Atlas of Protein Sequences and Structure, M.O. Dayhoff ed., 5 suppl. 
3:353-358, National Biomedical Research Foundation, Washington, D.C., USA, and 
normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An implementation of 
this algorithm for nucleic acid and peptide sequences is provided by the Genetics Computer 
Group (Madison, WI) in their BestFit utility application. The default parameters for this 
method are described in the Wisconsin Sequence Analysis Package Program Manual, Version 
8 (1995) (available from Genetics Computer Group, Madison, WI). Other equally suitable 
programs for calculating the percent identity or similarity between sequences are generally 
known in the art. 
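
The percent identity definition given above (exact matches between two aligned sequences, divided by the length of the shorter sequence, multiplied by 100) can be illustrated with a short sketch. The function name `percent_identity`, the use of "-" as the gap character and the example sequences are assumptions made for illustration only; they are not taken from BestFit or from any other program cited herein.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Exact matches between two aligned sequences, divided by the length
    of the shorter (ungapped) sequence and multiplied by 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same residue (gaps never match).
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x == y and x != gap
    )
    # Sequence lengths are measured without gap characters.
    shorter = min(
        len(aligned_a.replace(gap, "")),
        len(aligned_b.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


if __name__ == "__main__":
    # Hypothetical aligned nucleotide sequences; the second is two residues shorter.
    print(round(percent_identity("ACGTACGT", "ACGAAC--"), 1))
```

In this sketch the denominator is the ungapped length of the shorter sequence, which is one reading of the definition above; other programs may normalize by alignment length instead.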

For example, percent identity of a particular nucleotide sequence to a reference 
sequence can be determined using the homology algorithm of Smith and Waterman with a 
default scoring table and a gap penalty of six nucleotide positions. Another method of 
establishing percent identity in the context of the present invention is to use the MPSRCH 
package of programs copyrighted by the University of Edinburgh, developed by John F. 
Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, CA). 
From this suite of packages, the Smith-Waterman algorithm can be employed where default 
parameters are used for the scoring table (for example, gap open penalty of 12, gap extension 
penalty of one, and a gap of six). From the data generated, the "Match" value reflects 
"sequence identity." Other suitable programs for calculating the percent identity or similarity 
between sequences are generally known in the art, such as the alignment program BLAST, 
which can also be used with default parameters. For example, BLASTN and BLASTP can be 
used with the following default parameters: genetic code = standard; filter = none; strand = 
both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences; sort by = 
HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + PDB + GenBank 
CDS translations + Swiss protein + Spupdate + PIR. Details of these programs can be found 
at the following internet address: http://www.ncbi.nlm.gov/cgi-bin/BLAST. 
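
The local homology algorithm of Smith and Waterman referred to above can be sketched as follows. This is a minimal illustration only: it uses a simple match/mismatch score and a single linear gap penalty, whereas the MPSRCH defaults mentioned above use separate gap open and gap extension penalties. The function name `smith_waterman` and the scoring values are assumptions and do not reproduce the scoring table of any cited package.

```python
def smith_waterman(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Simplified sketch: linear gap penalty, identity-based scoring.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; scores are floored at zero
    # so that an alignment may start anywhere (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Hypothetical sequences sharing a seven-nucleotide core.
    print(smith_waterman("TTTACGTACGTTT", "GGACGTACGGG"))
```

The aligned strings returned by such a routine can then be scored for percent identity as defined above.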

One of skill in the art can readily determine the proper search parameters to use for a 
given sequence in the above programs. For example, the search parameters may vary based 
on the size of the sequence in question. Thus, for example, a representative embodiment of 
the present invention would include an isolated polynucleotide having X contiguous 
nucleotides, wherein (i) the X contiguous nucleotides have at least about 50% identity to Y 
contiguous nucleotides derived from any of the sequences described herein, (ii) X equals Y, 
and (iii) X is greater than or equal to 6 nucleotides and up to 5000 nucleotides, preferably 
greater than or equal to 8 nucleotides and up to 5000 nucleotides, more preferably 10-12 
nucleotides and up to 5000 nucleotides, and even more preferably 15-20 nucleotides, up to 
the number of nucleotides present in the full-length sequences described herein (e.g., see the 
Sequence Listing and claims), including all integer values falling within the above-described 
ranges. 
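
Conditions (i) to (iii) above can be expressed as a simple window comparison. The sketch below assumes an ungapped, position-by-position comparison of the X query nucleotides against every window of Y = X contiguous reference nucleotides; the function name `meets_contiguous_identity` and its default values are hypothetical and are chosen only to mirror the broadest range recited above.

```python
def meets_contiguous_identity(
    query: str,
    reference: str,
    min_identity: float = 50.0,
    min_len: int = 6,
    max_len: int = 5000,
) -> bool:
    """True if the query (X contiguous nucleotides) has at least
    `min_identity` percent identity to some window of Y = X contiguous
    nucleotides in the reference. Ungapped comparison (an assumption)."""
    x = len(query)
    # Condition (iii): X must fall within the permitted length range.
    if not (min_len <= x <= max_len):
        return False
    query, reference = query.upper(), reference.upper()
    # Condition (ii): every reference window examined has length Y == X.
    for start in range(len(reference) - x + 1):
        window = reference[start:start + x]
        matches = sum(q == r for q, r in zip(query, window))
        # Condition (i): percent identity threshold.
        if 100.0 * matches / x >= min_identity:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical sequences: five of six positions match in one window.
    print(meets_contiguous_identity("ACGTAC", "TTTACGTTCTTT", min_identity=80.0))
```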

The synthetic expression cassettes (and purified polynucleotides) of the present 
invention include related polynucleotide sequences having about 80% to 100%, greater than 
80-85%, preferably greater than 90-92%, more preferably greater than 95%, and most 
preferably greater than 98% sequence (including all integer values falling within these 
described ranges) identity to the synthetic expression cassette sequences disclosed herein (for 
example, to the claimed sequences or other sequences of the present invention) when the 
sequences of the present invention are used as the query sequence. 

Computer programs are also available to determine the likelihood of certain 
polypeptides to form structures such as β-sheets. One such program, described herein, is the 
"ALB" program for protein and polypeptide secondary structure calculation and prediction. 
In addition, secondary protein structure can be predicted from the primary amino acid 
sequence, for example using protein crystal structure and aligning the protein sequence 
related to the crystal structure (e.g., using Molecular Operating Environment (MOE) 
programs available from the Chemical Computing Group Inc., Montreal, P.Q., Canada). 
Other methods of predicting secondary structures are described, for example, in Garnier et al. 
(1996) Methods Enzymol. 266:540-553; Geourjon et al. (1995) Comput. Applic. Biosci. 
11:681-684; Levin (1997) Protein Eng. 10:771-776; and Rost et al. (1993) J. Molec. Biol. 
232:584-599. 

Homology can also be determined by hybridization of polynucleotides under 
conditions which form stable duplexes between homologous regions, followed by digestion 
with single-stranded-specific nuclease(s), and size determination of the digested fragments. 
Two DNA, or two polypeptide, sequences are "substantially homologous" to each other when 
the sequences exhibit at least about 80%-85%, preferably at least about 90%, and most 
preferably at least about 95%-98% sequence identity over a defined length of the molecules, 
as determined using the methods above. As used herein, substantially homologous also refers 
to sequences showing complete identity to the specified DNA or polypeptide sequence. DNA 
sequences that are substantially homologous can be identified in a Southern hybridization 
experiment under, for example, stringent conditions, as defined for that particular system. 
Defining appropriate hybridization conditions is within the skill of the art. See, e.g., 
Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra. 

A "coding sequence" or a sequence which "encodes" a selected protein, is a nucleic 

acid sequence which is transcribed (in the case of DNA) and translated (in the case of 
mRNA) into a polypeptide in vitro or in vivo when placed under the control of appropriate 
regulatory sequences. The boundaries of the coding sequence are determined by a start codon 
at the 5' (amino) terminus and a translation stop codon at the 3' (carboxy) terminus. A coding 
sequence can include, but is not limited to, cDNA from viral nucleotide sequences as well as 
synthetic and semisynthetic DNA sequences and sequences including base analogs. A 
transcription termination sequence may be located 3' to the coding sequence. 
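
The way a start codon and an in-frame stop codon bound a coding sequence can be illustrated with the following sketch. It assumes the standard genetic code, a DNA-sense input strand and ATG as the only start codon; the function names `find_coding_sequence` and `translate` and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def find_coding_sequence(dna: str):
    """Return the first stretch running from an ATG start codon to the
    first in-frame stop codon (inclusive), or None if there is none."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this start codon.
        for pos in range(start, len(dna) - 2, 3):
            if CODON_TABLE.get(dna[pos:pos + 3]) == "*":
                return dna[start:pos + 3]
        start = dna.find("ATG", start + 1)
    return None


def translate(cds: str) -> str:
    """Translate a coding sequence up to, but not including, the stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[pos:pos + 3].upper(), "X")  # "X" for unknown codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    cds = find_coding_sequence("ccATGGCTAAATGAgg")  # hypothetical sequence
    print(cds, translate(cds))
```

Control elements, untranslated regions and splicing are outside the scope of this sketch; it shows only how the 5' and 3' boundaries are located.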

"Control elements" refers collectively to promoter sequences, ribosome binding sites, 
polyadenylation signals, transcription termination sequences, upstream regulatory domains, 
enhancers, and the like, which collectively provide for the transcription and translation of a 
coding sequence in a host cell. Not all of these control elements need always be present so 
long as the desired gene is capable of being transcribed and translated. 

A control element "directs the transcription" of a coding sequence in a cell when RNA 
polymerase will bind the promoter sequence and transcribe the coding sequence into mRNA, 
which is then translated into the polypeptide encoded by the coding sequence. 

"Operably linked" refers to an arrangement of elements wherein the components so 
described are configured so as to perform their usual function. Thus, control elements 
operably linked to a coding sequence are capable of effecting the expression of the coding 
sequence when RNA polymerase is present. The control elements need not be contiguous 
with the coding sequence, so long as they function to direct the expression thereof. Thus, for 
example, intervening untranslated yet transcribed sequences can be present between, e.g., a 
promoter sequence and the coding sequence and the promoter sequence can still be 
considered "operably linked" to the coding sequence. 

"Recombinant" as used herein to describe a nucleic acid molecule means a 
polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its 
origin or manipulation: (1) is not associated with all or a portion of the polynucleotide with 
which it is associated in nature; and/or (2) is linked to a polynucleotide other than that to 
which it is linked in nature. The term "recombinant" as used with respect to a protein or 
polypeptide means a polypeptide produced by expression of a recombinant polynucleotide. 
"Recombinant host cells," "host cells," "cells," "cell lines," "cell cultures," and other such 
terms denoting procaryotic microorganisms or eucaryotic cell lines cultured as unicellular 
entities, are used interchangeably, and refer to cells which can be, or have been, used as 
recipients for recombinant vectors or other transfer DNA, and include the progeny of the 
original cell which has been transfected. It is understood that the progeny of a single parental 
cell may not necessarily be completely identical in morphology or in genomic or total DNA 
complement to the original parent, due to accidental or deliberate mutation. Progeny of the 
parental cell which are sufficiently similar to the parent to be characterized by the relevant 
property, such as the presence of a nucleotide sequence encoding a desired peptide, are 
included in the progeny intended by this definition, and are covered by the above terms. 

By "vertebrate subject" is meant any member of the subphylum Chordata, including, 
without limitation, humans and other primates, including non-human primates such as 
chimpanzees and other apes and monkey species; farm animals such as cattle, sheep, pigs, 
goats and horses; domestic mammals such as dogs and cats; laboratory animals including 
rodents such as mice, rats and guinea pigs; birds, including domestic, wild and game birds 
such as chickens, turkeys and other gallinaceous birds, ducks, geese, and the like. The term 
does not denote a particular age. Thus, both adult and newborn individuals are intended to be 
covered. 

As used herein, a "biological sample" refers to a sample of tissue or fluid isolated 
from an individual, including but not limited to, for example, blood, plasma, serum, fecal 
matter, urine, bone marrow, bile, spinal fluid, lymph fluid, samples of the skin, external 
secretions of the skin, respiratory, intestinal, and genitourinary tracts, samples derived from 
the gastric epithelium and gastric mucosa, tears, saliva, milk, blood cells, organs, biopsies 
and also samples of in vitro cell culture constituents including but not limited to conditioned 
media resulting from the growth of cells and tissues in culture medium, e.g., recombinant 
cells, and cell components. 

The terms "label" and "detectable label" refer to a molecule capable of detection, 
including, but not limited to, radioactive isotopes, fluorescers, chemiluminescers, enzymes, 
enzyme substrates, enzyme cofactors, enzyme inhibitors, chromophores, dyes, metal ions, 
metal sols, ligands (e.g., biotin or haptens) and the like. The term "fluorescer" refers to a 
substance or a portion thereof which is capable of exhibiting fluorescence in the detectable 
range. Particular examples of labels which may be used with the invention include, but are 
not limited to, fluorescein, rhodamine, dansyl, umbelliferone, Texas red, luminol, acridinium 
esters, NADPH, α-β-galactosidase, horseradish peroxidase, glucose oxidase, alkaline 
phosphatase and urease. 

Overview 

The present invention concerns modified Env polypeptide molecules (e.g., 
glycoprotein ("gp") 120). Without being bound by a particular theory, it appears that it has 
been difficult to generate immunological responses against Env because the CD4 binding site 
is buried between the outer domain, the inner domain and the V1/V2 domains. Thus, 
although deletion of the V1/V2 domain may render the virus more susceptible to 
neutralization by monoclonal antibody directed to the CD4 site, the bridging sheet covering 
most of the CD4 binding domain may prevent an antibody response. Thus, the present 
invention provides Env polypeptides that maintain their general overall structure yet expose 
the CD4 binding domain. This allows the generation of an immune response (e.g., an 
antibody response) to epitopes in or near the CD4 binding site. 

Various forms of the different embodiments of the invention, described herein, may 
be combined. 

β-Sheet Conformations 

In the present invention, the locations of the β-sheet structures were identified relative to 
the 3-D (crystal) structure of an HXB-2 crystallized Env protein (see, Example 1.A). Based on 
this structure, constructs encoding polypeptides having replacements and/or excisions which 
maintain overall geometry while exposing the CD4 binding site were designed. In particular, 
the crystal structure of HXB-2 was downloaded from the Brookhaven Database. Using the 
default parameters of the Loop Search feature of the Biopolymer module of the Sybyl 
molecular modeling package, homology and fit of amino acids which could replace the native 
loops between β-strands yet maintain overall tertiary structure were determined. Constructs 
encoding the modified Env polypeptides were then designed (Example 1.B). 

Thus, the modified Env polypeptides typically have enough of the bridging sheet 
removed to expose the CD4 groove, but have enough of the structure to allow correct folding 
of the Env glycoprotein. Exemplary constructs are described below. 

Polypeptide Production 

The polypeptides of the present invention can be produced in any number of ways 
which are well known in the art. 

In one embodiment, the polypeptides are generated using recombinant techniques, 
well known in the art. In this regard, oligonucleotide probes can be devised based on the 
known sequences of the Env (e.g., gp120) polypeptide genome and used to probe genomic or 
cDNA libraries for Env genes. The gene can then be further isolated using standard 
techniques and, e.g., restriction enzymes employed to truncate the gene at desired portions of 
the full-length sequence. Similarly, the Env gene(s) can be isolated directly from cells and 
tissues containing the same, using known techniques, such as phenol extraction, and the 
sequence further manipulated to produce the desired truncations. See, e.g., Sambrook et al., 
supra, for a description of techniques used to obtain and isolate DNA. 

The genes encoding the modified (e.g., truncated and/or substituted) polypeptides can 
be produced synthetically, based on the known sequences. The nucleotide sequence can be 
designed with the appropriate codons for the particular amino acid sequence desired. The 
complete coding sequence is generally assembled from overlapping oligonucleotides prepared 
by standard methods. See, e.g., Edge (1981) 
Nature 292:756; Nambiar et al. (1984) Science 223:1299; Jay et al. (1984) J. Biol. Chem. 
259:6311; Stemmer et al. (1995) Gene 164:49-53. 
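
The design steps described above (choosing a codon for each amino acid and dividing the resulting sequence into overlapping oligonucleotides) can be sketched as follows. The codon choices in `PREFERRED_CODON`, the oligonucleotide length and the overlap are illustrative assumptions and are not taken from the constructs of the present invention; for simplicity all oligonucleotides are written on one strand, whereas practical assembly schemes typically alternate strands.

```python
# One valid codon per amino acid (standard genetic code). The particular
# choices are hypothetical; a real design would use host-appropriate codons.
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def back_translate(protein: str) -> str:
    """Design a nucleotide sequence encoding the given amino acid sequence."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def overlapping_oligos(seq: str, oligo_len: int = 40, overlap: int = 20):
    """Split a sequence into oligonucleotides in which each oligonucleotide
    shares `overlap` nucleotides with the one that follows it."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos, start = [], 0
    while True:
        oligos.append(seq[start:start + oligo_len])
        if start + oligo_len >= len(seq):
            break
        start += step
    return oligos


def reassemble(oligos, overlap: int = 20) -> str:
    """Rebuild the full sequence by dropping each shared overlap once."""
    return oligos[0] + "".join(o[overlap:] for o in oligos[1:])


if __name__ == "__main__":
    gene = back_translate("MAKV*")  # hypothetical five-codon example
    parts = overlapping_oligos(gene, oligo_len=8, overlap=4)
    print(gene, parts, reassemble(parts, overlap=4) == gene)
```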

Recombinant techniques are readily used to clone a gene encoding an Env 
polypeptide, which can then be mutagenized in vitro by the replacement of the 
appropriate base pair(s) to result in the codon for the desired amino acid. Such a change can 
include as little as one base pair, effecting a change in a single amino acid, or can encompass 
several base pair changes. Alternatively, the mutations can be effected using a mismatched 
primer which hybridizes to the parent nucleotide sequence (generally cDNA corresponding to 
the RNA sequence), at a temperature below the melting temperature of the mismatched 
duplex. The primer can be made specific by keeping primer length and base composition 
within relatively narrow limits and by keeping the mutant base centrally located. See, e.g., 
Innis et al., (1990) PCR Applications: Protocols for Functional Genomics; Zoller and Smith, 
Methods Enzymol. (1983) 100:468. Primer extension is effected using DNA polymerase, the 
product cloned and clones containing the mutated DNA, derived by segregation of the primer 
extended strand, selected. Selection can be accomplished using the mutant primer as a 
hybridization probe. The technique is also applicable for generating multiple point 
mutations. See, e.g., Dalbadie-McFarland et al., Proc. Natl. Acad. Sci. USA (1982) 79:6409. 
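
The design of a mismatched primer with a centrally located mutant base can be sketched as shown below. The function names, the flank length and the template are hypothetical. The melting temperature estimate uses the Wallace rule (2 °C per A or T, 4 °C per G or C), a rough approximation for short, perfectly matched oligonucleotides that is not drawn from the references cited above; the mismatched duplex will melt somewhat lower, and strand orientation is ignored for simplicity.

```python
def mutagenic_primer(template: str, position: int, new_base: str, flank: int = 10) -> str:
    """Return a primer of length 2*flank + 1 carrying `new_base` at its
    center, flanked on each side by `flank` template nucleotides.

    `position` is the 0-based index of the base to change.
    """
    template = template.upper()
    new_base = new_base.upper()
    if len(new_base) != 1 or new_base not in "ACGT":
        raise ValueError("new_base must be one of A, C, G, T")
    if position - flank < 0 or position + flank >= len(template):
        raise ValueError("not enough flanking sequence around the position")
    if template[position] == new_base:
        raise ValueError("new_base is identical to the template base")
    left = template[position - flank:position]
    right = template[position + 1:position + 1 + flank]
    return left + new_base + right


def wallace_tm(primer: str) -> int:
    """Rough melting temperature estimate (degrees C) by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    template = "ATGGCTAAAGGTCCTGAAACC"  # hypothetical parent sequence
    primer = mutagenic_primer(template, position=10, new_base="A", flank=5)
    print(primer, wallace_tm(primer))
```

An annealing temperature would then be chosen below the estimated value, consistent with hybridizing below the melting temperature of the mismatched duplex.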

Once coding sequences for the desired proteins have been isolated or synthesized, 
they can be cloned into any suitable vector or replicon for expression. As will be apparent 
from the teachings herein, a wide variety of vectors encoding modified polypeptides can be 
generated by creating expression constructs which operably link, in various combinations, 
polynucleotides encoding Env polypeptides having deletions or mutations therein. Thus, 
polynucleotides encoding a particular deleted V1/V2 region can be operably linked with 
polynucleotides encoding polypeptides having deletions or replacements in the small loop 
region and the construct introduced into a host cell for polypeptide expression. Non-limiting 
examples of such combinations are discussed in the Examples. 

Numerous cloning vectors are known to those of skill in the art, and the selection of 
an appropriate cloning vector is a matter of choice. Examples of recombinant DNA vectors 
for cloning and host cells which they can transform include the bacteriophage λ (E. coli), 
pBR322 (E. coli), pACYC177 (E. coli), pKT230 (gram-negative bacteria), pGV1106 
(gram-negative bacteria), pLAFR1 (gram-negative bacteria), pME290 (non-E. coli 
gram-negative bacteria), pHV14 (E. coli and Bacillus subtilis), pBD9 (Bacillus), pIJ61 
(Streptomyces), pUC6 (Streptomyces), YIp5 (Saccharomyces), YCp19 (Saccharomyces) and 
bovine papilloma virus (mammalian cells). See, generally, DNA Cloning: Vols. I & II, supra; 
Sambrook et al., supra; B. Perbal, supra. 

Insect cell expression systems, such as baculovirus systems, can also be used and are 
known to those of skill in the art and described in, e.g., Summers and Smith, Texas 
Agricultural Experiment Station Bulletin No. 1555 (1987). Materials and methods for 
baculovirus/insect cell expression systems are commercially available in kit form from, inter 
alia, Invitrogen, San Diego CA ("MaxBac" kit). 

Plant expression systems can also be used to produce the modified Env proteins. 
Generally, such systems use virus-based vectors to transfect plant cells with heterologous 
genes. For a description of such systems see, e.g., Porta et al., Mol. Biotech. (1996) 5:209-221; 
and Hackland et al., Arch. Virol. (1994) 139:1-22. 

Viral systems, such as a vaccinia based infection/transfection system, as described in 
Tomei et al., J. Virol. (1993) 67:4017-4026 and Selby et al., J. Gen. Virol. (1993) 
74:1103-1113, will also find use with the present invention. In this system, cells are first 
infected in vitro with a vaccinia virus recombinant that encodes the bacteriophage T7 RNA 
polymerase. This polymerase displays exquisite specificity in that it only transcribes 
templates bearing T7 promoters. Following infection, cells are transfected with the DNA of 
interest, driven by a T7 promoter. The polymerase expressed in the cytoplasm from the 
vaccinia virus recombinant transcribes the transfected DNA into RNA which is then 
translated into protein by the host translational machinery. The method provides for high 
level, transient, cytoplasmic production of large quantities of RNA and its translation 
product(s). 

The gene can be placed under the control of a promoter, ribosome binding site (for 
bacterial expression) and, optionally, an operator (collectively referred to herein as "control" 
elements), so that the DNA sequence encoding the desired Env polypeptide is transcribed into 
RNA in the host cell transformed by a vector containing this expression construction. The 
coding sequence may or may not contain a signal peptide or leader sequence. With the 
present invention, either the naturally occurring signal peptides or heterologous sequences can 
be used. Leader sequences can be removed by the host in post-translational processing. See, 
e.g., U.S. Patent Nos. 4,431,739; 4,425,437; 4,338,397. Such sequences include, but are not 
limited to, the TPA leader, as well as the honey bee melittin signal sequence. 

Other regulatory sequences may also be desirable which allow for regulation of 
expression of the protein sequences relative to the growth of the host cell. Such regulatory 
sequences are known to those of skill in the art, and examples include those which cause the 
expression of a gene to be turned on or off in response to a chemical or physical stimulus, 
including the presence of a regulatory compound. Other types of regulatory elements may 
also be present in the vector, for example, enhancer sequences. 

The control sequences and other regulatory sequences may be ligated to the coding 
sequence prior to insertion into a vector. Alternatively, the coding sequence can be cloned 
directly into an expression vector which already contains the control sequences and an 
appropriate restriction site. 

In some cases it may be necessary to modify the coding sequence so that it may be 
attached to the control sequences with the appropriate orientation; i.e., to maintain the proper 
reading frame. Mutants or analogs may be prepared by the deletion of a portion of the 
sequence encoding the protein, by insertion of a sequence, and/or by substitution of one or 
more nucleotides within the sequence. Techniques for modifying nucleotide sequences, such 
as site-directed mutagenesis, are well known to those skilled in the art. See, e.g., Sambrook 
et al., supra; DNA Cloning, Vols. I and II, supra; Nucleic Acid Hybridization, supra. 

The expression vector is then used to transform an appropriate host cell. A number of 
mammalian cell lines are known in the art and include immortalized cell lines available from 
the American Type Culture Collection (ATCC), such as, but not limited to, Chinese hamster 
ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, monkey kidney cells 
(COS), human hepatocellular carcinoma cells (e.g., Hep G2), Vero293 cells, as well as others. 
Similarly, bacterial hosts such as E. coli, Bacillus subtilis, and Streptococcus spp., will find 
use with the present expression constructs. Yeast hosts useful in the present invention 
include inter alia, Saccharomyces cerevisiae, Candida albicans, Candida maltosa, 
Hansenula polymorpha, Kluyveromyces fragilis, Kluyveromyces lactis, Pichia guilliermondii, 
Pichia pastoris, Schizosaccharomyces pombe and Yarrowia lipolytica. Insect cells for use 
with baculovirus expression vectors include, inter alia, Aedes aegypti, Autographa 
californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and 
Trichoplusia ni. 

Depending on the expression system and host selected, the proteins of the present 
invention are produced by growing host cells transformed by an expression vector described 
above under conditions whereby the protein of interest is expressed. The selection of the 
appropriate growth conditions is within the skill of the art. 

In one embodiment, the transformed cells secrete the polypeptide product into the 
surrounding media. Certain regulatory sequences can be included in the vector to enhance 
secretion of the protein product, for example using a tissue plasminogen activator (TPA) 
leader sequence, a γ-interferon signal sequence or other signal peptide sequences from known 
secretory proteins. The secreted polypeptide product can then be isolated by various 
techniques described herein, for example, using standard purification techniques such as but 
not limited to, hydroxyapatite resins, column chromatography, ion-exchange 
chromatography, size-exclusion chromatography, electrophoresis, HPLC, immunoadsorbent 
techniques, affinity chromatography, immunoprecipitation, and the like. 

Alternatively, the transformed cells are disrupted, using chemical, physical or 
mechanical means, which lyse the cells yet keep the Env polypeptides substantially intact. 
Intracellular proteins can also be obtained by removing components from the cell wall or 
membrane, e.g., by the use of detergents or organic solvents, such that leakage of the Env 
polypeptides occurs. Such methods are known to those of skill in the art and are described in, 
e.g., Protein Purification Applications: A Practical Approach, (E.L.V. Harris and S. Angal, 
Eds., 1990). 

For example, methods of disrupting cells for use with the present invention include 
but are not limited to: sonication or ultrasonication; agitation; liquid or solid extrusion; heat 
treatment; freeze-thaw; desiccation; explosive decompression; osmotic shock; treatment with 
lytic enzymes including proteases such as trypsin, neuraminidase and lysozyme; alkali 
treatment; and the use of detergents and solvents such as bile salts, sodium dodecylsulphate, 
Triton, NP40 and CHAPS. The particular technique used to disrupt the cells is largely a 
matter of choice and will depend on the cell type in which the polypeptide is expressed, 
culture conditions and any pre-treatment used. 

Following disruption of the cells, cellular debris is removed, generally by 
centrifugation, and the intracellularly produced Env polypeptides are further purified, using 
standard purification techniques such as but not limited to, column chromatography, 
ion-exchange chromatography, size-exclusion chromatography, electrophoresis, HPLC, 
immunoadsorbent techniques, affinity chromatography, immunoprecipitation, and the like. 

For example, one method for obtaining the intracellular Env polypeptides of the 
present invention involves affinity purification, such as by immunoaffinity chromatography 
using anti-Env specific antibodies, or by lectin affinity chromatography. Particularly 
preferred lectin resins are those that recognize mannose moieties such as but not limited to 
resins derived from Galanthus nivalis agglutinin (GNA), Lens culinaris agglutinin (LCA or 
lentil lectin), Pisum sativum agglutinin (PSA or pea lectin), Narcissus pseudonarcissus 
agglutinin (NPA) and Allium ursinum agglutinin (AUA). The choice of a suitable affinity 
resin is within the skill in the art. After affinity purification, the Env polypeptides can be 
further purified using conventional techniques well known in the art, such as by any of the 
techniques described above. 

It may be desirable to produce Env (e.g., gp120) complexes, either with itself or other 
proteins. Such complexes are readily produced by, e.g., co-transfecting host cells with 
constructs encoding the Env (e.g., gp120) and/or other polypeptides of the desired 
complex. Co-transfection can be accomplished either in trans or cis, i.e., by using separate 
vectors or by using a single vector which bears both the Env and the other gene. If done using 
a single vector, both genes can be driven by a single set of control elements or, alternatively, 
the genes can be present on the vector in individual expression cassettes, driven by individual 
control elements. Following expression, the proteins will spontaneously associate. 
Alternatively, the complexes can be formed by mixing the individual proteins together which 
have been produced separately, either in purified or semi-purified form, or even by mixing 
culture media in which host cells expressing the proteins have been cultured. See 
International Publication No. WO 96/04301, published February 15, 1996, for a description 
of such complexes. 

Relatively small polypeptides, i.e., up to about 50 amino acids in length, can be 
conveniently synthesized chemically, for example by any of several techniques that are 
known to those skilled in the peptide art. In general, these methods employ the sequential 
addition of one or more amino acids to a growing peptide chain. Normally, either the amino 
or carboxyl group of the first amino acid is protected by a suitable protecting group. The 
protected or derivatized amino acid can then be either attached to an inert solid support or 
utilized in solution by adding the next amino acid in the sequence having the complementary 
(amino or carboxyl) group suitably protected, under conditions that allow for the formation of 
an amide linkage. The protecting group is then removed from the newly added amino acid 

10 residue and the next amino acid (suitably protected) is then added, and so forth. After the 
desired amino acids have been linked in the proper sequence, any remaining protecting 
groups (and any solid support, if solid phase synthesis techniques are used) are removed 
sequentially or concurrently, to render the final polypeptide. By simple modification of this 
general procedure, it is possible to add more than one amino acid at a time to a growing 

15 chain, for example, by coupling (under conditions which do not racemize chiral centers) a 
protected tripeptide with a properly protected dipeptide to form, after deprotection, a 
pentapeptide. See, e.g., J. M. Stewart and J. D. Young, Solid Phase Peptide Synthesis 
(Pierce Chemical Co., Rockford, IL 1984) and G. Barany and R. B. Merrifield, The Peptides: 
Analysis. Synthesis. Biology, editors E. Gross and J. Meienhofer, Vol. 2, (Academic Press, 

20 New York, 1980), pp. 3-254, for solid phase peptide synthesis techniques; and M. Bodansky, 
Principles of Peptide Synthesis, (Springer-Verlag, Berlin 1984) and E. Gross and J. 
Meienhofer, Eds., The Peptides: Analysis. Synthesis. Biology. Vol. 1, for classical solution 
synthesis. 

Typical protecting groups include t-butyloxycarbonyl (Boc),
9-fluorenylmethoxycarbonyl (Fmoc), benzyloxycarbonyl (Cbz); p-toluenesulfonyl (Tx);
2,4-dinitrophenyl; benzyl (Bzl); biphenylisopropyloxycarboxy-carbonyl,
t-amyloxycarbonyl, isobornyloxycarbonyl, o-bromobenzyloxycarbonyl, cyclohexyl, isopropyl,
acetyl, o-nitrophenylsulfonyl and the like.

Typical solid supports are cross-linked polymeric supports. These can include
divinylbenzene cross-linked-styrene-based polymers, for example,
divinylbenzene-hydroxymethylstyrene copolymers, divinylbenzene-chloromethylstyrene
copolymers and divinylbenzene-benzhydrylaminopolystyrene copolymers.

The polypeptide analogs of the present invention can also be chemically prepared by
other methods such as by the method of simultaneous multiple peptide synthesis. See, e.g.,
Houghten, Proc. Natl. Acad. Sci. USA (1985) 82:5131-5135; U.S. Patent No. 4,631,211.

Diagnostic and Vaccine Applications

The intracellularly produced Env polypeptides of the present invention, complexes
thereof, or the polynucleotides coding therefor, can be used for a number of diagnostic and
therapeutic purposes. For example, the proteins and polynucleotides, or antibodies generated
against the same, can be used in a variety of assays to determine the presence of reactive
antibodies and/or Env proteins in a biological sample, to aid in the diagnosis of HIV infection
or disease status, or as a measure of response to immunization.

The presence of antibodies reactive with the Env (e.g., gp120) polypeptides and,
conversely, antigens reactive with antibodies generated thereto, can be detected using
standard electrophoretic and immunodiagnostic techniques, including immunoassays such as
competition, direct reaction, or sandwich type assays. Such assays include, but are not
limited to, western blots; agglutination tests; enzyme-labeled and mediated immunoassays,
such as ELISAs; biotin/avidin type assays; radioimmunoassays; immunoelectrophoresis;
immunoprecipitation, etc. The reactions generally include revealing labels such as
fluorescent, chemiluminescent, radioactive, or enzymatic labels or dye molecules, or other
methods for detecting the formation of a complex between the antigen and the antibody or
antibodies reacted therewith.

Solid supports can be used in the assays, such as nitrocellulose, in membrane or
microtiter well form; polyvinylchloride, in sheets or microtiter wells; polystyrene latex, in
beads or microtiter plates; polyvinylidene fluoride; diazotized paper; nylon membranes;
activated beads, and the like.

Typically, the solid support is first reacted with the biological sample (or the gp120
proteins), washed and then the antibodies (or a sample suspected of containing antibodies)
applied. After washing to remove any non-bound ligand, a secondary binder moiety is added
under suitable binding conditions, such that the secondary binder is capable of associating
selectively with the bound ligand. The presence of the secondary binder can then be detected
using techniques well known in the art. Typically, the secondary binder will comprise an
antibody directed against the antibody ligands. A number of anti-human immunoglobulin
(Ig) molecules are known in the art (e.g., commercially available goat anti-human Ig or rabbit
anti-human Ig). Ig molecules for use herein will preferably be of the IgG or IgA type;
however, IgM may also be appropriate in some instances. The Ig molecules can be readily
conjugated to a detectable enzyme label, such as horseradish peroxidase, glucose oxidase,
beta-galactosidase, alkaline phosphatase and urease, among others, using methods known to
those of skill in the art. An appropriate enzyme substrate is then used to generate a detectable
signal.

Alternatively, a "two antibody sandwich" assay can be used to detect the proteins of
the present invention. In this technique, the solid support is reacted first with one or more of
the antibodies directed against Env (e.g., gp120), washed and then exposed to the test sample.
Antibodies are again added and the reaction visualized using either a direct color reaction or
using a labeled second antibody, such as an anti-immunoglobulin labeled with horseradish
peroxidase, alkaline phosphatase or urease.

Assays can also be conducted in solution, such that the viral proteins and antibodies
thereto form complexes under precipitating conditions. The precipitated complexes can then
be separated from the test sample, for example, by centrifugation. The reaction mixture can
be analyzed to determine the presence or absence of antibody-antigen complexes using any of
a number of standard methods, such as those immunodiagnostic methods described above.

The modified Env proteins, produced as described above, or antibodies to the
proteins, can be provided in kits, with suitable instructions and other necessary reagents, in
order to conduct immunoassays as described above. The kit can also contain, depending on
the particular immunoassay used, suitable labels and other packaged reagents and materials
(e.g., wash buffers and the like). Standard immunoassays, such as those described above, can
be conducted using these kits.

The Env polypeptides and polynucleotides encoding the polypeptides can also be used
in vaccine compositions, individually or in combination, in, e.g., prophylactic (i.e., to prevent
infection) or therapeutic (to treat HIV following infection) vaccines. The vaccines can
comprise mixtures of one or more of the modified Env proteins (or nucleotide sequences
encoding the proteins), such as Env (e.g., gp120) proteins derived from more than one viral
isolate. The vaccine may also be administered in conjunction with other antigens and
immunoregulatory agents, for example, immunoglobulins, cytokines, lymphokines, and
chemokines, including but not limited to IL-2, modified IL-2 (cys125-ser125), GM-CSF,
IL-12, γ-interferon, IP-10, MIP1β and RANTES. The vaccines may be administered as
polypeptides or, alternatively, as naked nucleic acid vaccines (e.g., DNA), using viral vectors
(e.g., retroviral vectors, adenoviral vectors, adeno-associated viral vectors) or non-viral
vectors (e.g., liposomes, particles coated with nucleic acid or protein). The vaccines may also
comprise a mixture of protein and nucleic acid, which in turn may be delivered using the
same or different vehicles. The vaccine may be given more than once (e.g., a "prime"
administration followed by one or more "boosts") to achieve the desired effects. The same
composition can be administered as the prime and as the one or more boosts. Alternatively,
different compositions can be used for priming and boosting.

The vaccines will generally include one or more "pharmaceutically acceptable
excipients or vehicles" such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary
substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may
be present in such vehicles.

A carrier is optionally present which is a molecule that does not itself induce the
production of antibodies harmful to the individual receiving the composition. Suitable
carriers are typically large, slowly metabolized macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycollic acids, polymeric amino acids, amino acid
copolymers, lipid aggregates (such as oil droplets or liposomes), and inactive virus particles.
Such carriers are well known to those of ordinary skill in the art. Furthermore, the Env
polypeptide may be conjugated to a bacterial toxoid, such as toxoid from diphtheria, tetanus,
cholera, etc.

Adjuvants may also be used to enhance the effectiveness of the vaccines. Such
adjuvants include, but are not limited to: (1) aluminum salts (alum), such as aluminum
hydroxide, aluminum phosphate, aluminum sulfate, etc.; (2) oil-in-water emulsion
formulations (with or without other specific immunostimulating agents such as muramyl
peptides (see below) or bacterial cell wall components), such as for example (a) MF59
(International Publication No. WO 90/14837), containing 5% Squalene, 0.5% Tween 80, and
0.5% Span 85 (optionally containing various amounts of MTP-PE (see below), although not
required) formulated into submicron particles using a microfluidizer such as Model 110Y
microfluidizer (Microfluidics, Newton, MA), (b) SAF, containing 10% Squalane, 0.4%
Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP (see below) either
microfluidized into a submicron emulsion or vortexed to generate a larger particle size
emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, MT)
containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components
from the group consisting of monophosphorylipid A (MPL), trehalose dimycolate (TDM),
and cell wall skeleton (CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants,
such as Stimulon™ (Cambridge Bioscience, Worcester, MA), may be used or particles
generated therefrom such as ISCOMs (immunostimulating complexes); (4) Complete
Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as
interleukins (IL-1, IL-2, etc.), macrophage colony stimulating factor (M-CSF), tumor necrosis
factor (TNF), etc.; (6) detoxified mutants of a bacterial ADP-ribosylating toxin such as a
cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT), particularly
LT-K63 (where lysine is substituted for the wild-type amino acid at position 63), LT-R72
(where arginine is substituted for the wild-type amino acid at position 72), CT-S109 (where
serine is substituted for the wild-type amino acid at position 109), and PT-K9/G129 (where
lysine is substituted for the wild-type amino acid at position 9 and glycine substituted at
position 129) (see, e.g., International Publication Nos. WO 93/13202 and WO 92/19265); and
(7) other substances that act as immunostimulating agents to enhance the effectiveness of the
composition.

Muramyl peptides include, but are not limited to,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

Typically, the vaccine compositions are prepared as injectables, either as liquid
solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles
prior to injection may also be prepared. The preparation also may be emulsified or
encapsulated in liposomes for enhanced adjuvant effect, as discussed above.

The vaccines will comprise a therapeutically effective amount of the modified Env
proteins, or complexes of the proteins, or nucleotide sequences encoding the same, and any
other of the above-mentioned components, as needed. By "therapeutically effective amount"
is meant an amount of a modified Env (e.g., gp120) protein which will induce a protective
immunological response in the uninfected, infected or unexposed individual to which it is
administered. Such a response will generally result in the development in the subject of a
secretory, cellular and/or antibody-mediated immune response to the vaccine. Usually, such
a response includes but is not limited to one or more of the following effects: the production
of antibodies from any of the immunological classes, such as immunoglobulins A, D, E, G or
M; the proliferation of B and T lymphocytes; the provision of activation, growth and
differentiation signals to immunological cells; and the expansion of helper T cells, suppressor
T cells, and/or cytotoxic T cells.

Preferably, the effective amount is sufficient to bring about treatment or prevention of
disease symptoms. The exact amount necessary will vary depending on the subject being
treated; the age and general condition of the individual to be treated; the capacity of the
individual's immune system to synthesize antibodies; the degree of protection desired; the
severity of the condition being treated; the particular Env polypeptide selected and its mode
of administration, among other factors. An appropriate effective amount can be readily
determined by one of skill in the art. A "therapeutically effective amount" will fall in a
relatively broad range that can be determined through routine trials.

Once formulated, delivery of the nucleic acid vaccines may be accomplished with or without
viral vectors, as described above, by injection using either a conventional syringe or a gene gun,
such as the Accell® gene delivery system (PowderJect Technologies, Inc., Oxford, England).
Delivery of DNA into cells of the epidermis is particularly preferred as this mode of
administration provides access to skin-associated lymphoid cells and provides for a transient
presence of DNA in the recipient. Nucleic acids and/or peptides can be injected either
subcutaneously, epidermally, intradermally, intramucosally such as nasally, rectally and
vaginally, intraperitoneally, intravenously, orally or intramuscularly. Other modes of
administration include oral and pulmonary administration, suppositories, needle-less
injection, transcutaneous and transdermal applications. Dosage treatment may be a single
dose schedule or a multiple dose schedule. Administration of nucleic acids may also be
combined with administration of peptides or other substances.

While the invention has been described in conjunction with the preferred specific
embodiments thereof, it is to be understood that the foregoing description as well as the
examples which follow are intended to illustrate and not limit the scope of the invention.
Other aspects, advantages and modifications within the scope of the invention will be
apparent to those skilled in the art to which the invention pertains.

Experimental

Below are examples of specific embodiments for carrying out the present invention.
The examples are offered for illustrative purposes only, and are not intended to limit the
scope of the present invention in any way.

Efforts have been made to ensure accuracy with respect to numbers used (e.g.,
amounts, temperatures, etc.), but some experimental error and deviation should, of course, be
allowed for.

Example 1 

A.1. Best-Fit and Homology Searches

The crystal structure of HXB-2 gp120 was downloaded from the Brookhaven
database (COMPLEX (HIV ENVELOPE PROTEIN/CD4/FAB) 15-JUN-98 1GC1
TITLE: HIV-1 GP120 CORE COMPLEXED WITH CD4 AND A NEUTRALIZING
HUMAN ANTIBODY). Beta strands 3, 2, 21, and 20 of gp120 form a sheet near the CD4
binding site. Strands β-3 and β-2 are connected by the V1/V2 loop. Strands β-21 and β-20
are connected by another small loop. The H-bonds at the interface between strands β-2 and
β-21 are the only connection between domains of the "lower" half of the protein (joining
helix alpha 1 to the CD4 binding site). This beta sheet and these loops mask some antigens
(e.g., antigens which may generate neutralizing antibodies) that are only exposed during
CD4 binding.

Constructs were designed that remove enough of the beta sheet to expose the antigens in the
CD4 binding site, but leave enough of the protein to allow correct folding.
Specifically targeted were modifications to the small loop and, optionally, deletion of the
V1/V2 loops. Three different types of constructs were designed: (1) constructs encoding
polypeptides that leave the number of residues making up the entire 4-strand beta sheet intact,
but replace one or more residues; (2) constructs that encode polypeptides having at least one
residue of at least one beta strand excised; or (3) constructs encoding polypeptides having at
least two residues of at least one beta strand excised. Thus, a total of 6 different turns were
needed to rejoin the ends of the strands.

Initially, residues in the small loop (residues 427-430, relative to HXB-2) and
connected beta strands (β-20 and β-21) were modified to contain Gly and Pro (common in
beta turns). These sequences were then used as the target to match in each search. The
geometry of the target was matched to known proteins in the Brookhaven Protein Data Bank.
In particular, 5-residue turns (including an overlapping single residue at the N-terminal, the 2
residue target turn and 2 overlapping residues at the C-terminal) were searched in the
databases. In other words, these modified loops add a 2 residue turn that should be able to
support a geometry that will maintain the beta-sheet structure of the wild type protein. The
calculations were performed using the default parameters in the Loop Search feature of the
Biopolymer module of the Sybyl molecular modeling package. In each case, the 25 best fits
based on geometry alone were reviewed and, of those, several selected for homology and fit.

In addition, it was also determined what modifications could be made to remove most
of the V1/V2 loop (residues 124-198, relative to HXB-2) yet leave the geometry of the
protein intact. As with the small loop, constructs were also designed which excised one or
more residues from the β-2 strand (residues 119-123 of HXB-2), the β-3 strand (residues
199-201 of HXB-2) or both β-2 and β-3. For these constructs, known loops were searched to
match the geometry of a pentamer (including two remaining residues from the N-terminal
side, a 2 residue turn and 1 C-terminal residue). For these searches, Gly-Gly was preferred as
the insert along with at least one C-terminal substitution.
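
The geometric ranking step described above can be illustrated with a minimal sketch. It is not the Loop Search algorithm of the Sybyl package, whose internals are not described here; it substitutes a simpler, superposition-free measure, the distance RMSD (dRMSD), which compares all pairwise intra-fragment distances. The coordinates, the candidate names and the function names `drmsd` and `rank_candidates` are hypothetical and serve only to show how 5-residue candidates might be ordered by fit and cut to the 25 best.

```python
from itertools import combinations
from math import dist, sqrt


def drmsd(frag_a, frag_b):
    """Distance RMSD between two equal-length lists of (x, y, z) points.

    All pairwise intra-fragment distances are compared, so no superposition
    is needed: identical shapes score 0 wherever they sit in space.
    """
    if len(frag_a) != len(frag_b):
        raise ValueError("fragments must have the same number of points")
    pairs = list(combinations(range(len(frag_a)), 2))
    total = sum(
        (dist(frag_a[i], frag_a[j]) - dist(frag_b[i], frag_b[j])) ** 2
        for i, j in pairs
    )
    return sqrt(total / len(pairs))


def rank_candidates(target, candidates, keep=25):
    """Return up to `keep` (name, score) pairs, lowest dRMSD first."""
    scored = [(name, drmsd(target, coords)) for name, coords in candidates.items()]
    return sorted(scored, key=lambda item: item[1])[:keep]


# Hypothetical C-alpha trace of a 5-residue target (not taken from 1GC1).
TARGET = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.5, 3.4, 0.0),
          (3.8, 6.8, 0.0), (0.0, 6.8, 0.0)]

# Two hypothetical candidates: a translated copy and a distorted copy.
CANDIDATES = {
    "shifted_copy": [(x + 10.0, y - 2.0, z + 1.0) for x, y, z in TARGET],
    "stretched": [(x * 1.2, y, z) for x, y, z in TARGET],
}

ranking = rank_candidates(TARGET, CANDIDATES)
```

In this toy case the translated copy ranks first with a score of essentially zero, and the distorted copy ranks second. A real search would score database fragments against the modeled target and would then be followed by the homology review described above.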

A.2. Small Loop Replacements

In one aspect, the native sequence was replaced with residues that expose the CD4
binding site, but leave the overall geometry of the protein relatively unchanged. For the
small loop replacements, the target to match was: ASN425-MET426-GLY427-GLY428-GLY431.
Results of the search are summarized in Table 1.

Table 1: Search of Small Loop (Asn425 through Gly431)

Rank       Sequence              RMSD       % Homology   Seq Id No.
Best fit   LYS-ASP-SER-ASN-ASN   0.16689    62.5         27
3          TYR-GLY-LEU-GLY-LEU   0.220308   62.5         28
4          GLU-ARG-GLU-ASP-GLY   0.241754   62.5         29
7          ARG-LYS-GLY-GLY-ASN   0.24881    100          30
12         TRP-THR-GLY-SER-TYR   0.26417    83.33        31

Based on these results, constructs encoding Gly-Gly (#7), Gly-Ser (#12) or
Gly-Gly-Asn (#7) were recommended.
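
The selection from Table 1 can be retraced with a short sketch. The rows are transcribed from Table 1; the cutoff rule (keep hits with higher homology than the best geometric fit), the treatment of "Best fit" as rank 1, and the reading of positions 3 and 4 of each hit as the 2 residue turn are assumptions made for illustration, not a statement of how the selection was actually performed.

```python
# Rows transcribed from Table 1: (rank, sequence, RMSD, % homology).
# Treating the "Best fit" row as rank 1 is an assumption.
TABLE_1 = [
    (1, "LYS-ASP-SER-ASN-ASN", 0.16689, 62.5),
    (3, "TYR-GLY-LEU-GLY-LEU", 0.220308, 62.5),
    (4, "GLU-ARG-GLU-ASP-GLY", 0.241754, 62.5),
    (7, "ARG-LYS-GLY-GLY-ASN", 0.24881, 100.0),
    (12, "TRP-THR-GLY-SER-TYR", 0.26417, 83.33),
]


def shortlist(rows, min_homology):
    """Keep hits above a homology cutoff, best geometric fit (lowest RMSD) first."""
    kept = [row for row in rows if row[3] > min_homology]
    return sorted(kept, key=lambda row: row[2])


def turn_residues(sequence):
    """Return positions 3 and 4 of a 5-residue hit (assumed to be the turn)."""
    residues = sequence.split("-")
    return "-".join(residues[2:4])


# Assumed cutoff: more homologous than the best geometric fit.
baseline = TABLE_1[0][3]
picks = [(rank, turn_residues(seq)) for rank, seq, _, _ in shortlist(TABLE_1, baseline)]
```

Under these assumptions `picks` contains rank 7 with GLY-GLY and rank 12 with GLY-SER, which agrees with the turns recommended above.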

As V1/V2 and one or more residues of β-2 and β-3 are also optionally deleted in the
modified polypeptides of the invention, known loops to match the geometry of the V1/V2
loop were also searched. For the V1/V2 loop, the target to match was:
Lys121-Leu122-Gly123-Gly124-Ser199. Some notable matches are shown in Table 2:



Table 2: Search of V1/V2 loop (Lys121 through Ser199)

Rank       Sequence              RMSD       % Homology   Seq Id No.
Best fit   GLN-VAL-HIS-ASP-GLU   0.154764   68.75        32
2          LYS-GLU-GLY-ASP-LYS   0.15718    81.25        33
9          ARG-SER-GLY-ARG-SER   0.173731   68.75        34
11         THR-LEU-GLY-ASN-SER   0.175554   81.25        35
16         HIS-PHE-GLY-ALA-GLY   0.178772   93.75        36



Based on these searches, constructs encoding Gly-Asn in place of V1/V2 were
recommended.



A.3. One Additional Residue Excisions 

For a slightly truncated small loop, one more residue was trimmed from each beta 
strand to slightly shorten the beta sheet. The target to match was: ILE424-ASN425- 
GLY426-GLY427-LYS432. Results are shown in Table 3: 



Table 3: Search of Beta sheet shortened by One residue (Ile424 through Lys432)

Rank        Sequence              RMSD       % Homology   Seq Id No.
Best fit:   ARG-MET-ALA-PRO-VAL   0.316805   58.33        37
Best hom:   ASP-SER-ASP-GLY-PRO   0.440896   83.33        38

Although these searches showed more variation and worse fits than the previous
truncation, the Pro-Val or Pro-Leu encoding constructs were very similar. Accordingly,
Ala-Pro encoding constructs were recommended.

Sequences encoding gp120 polypeptides having V1/V2 deleted and an additional
residue from β-2 or β-3 excised were also searched. For the V1/V2 loop, the target to match was:
VAL120-LYS121-GLY122-GLY123-VAL200. Some notable matches are shown in Table 4.



Table 4: Search of V1/V2 loop (Val120 through Val200)

Rank        Sequence              RMSD       % Homology   Seq Id No.
Best fit:   THR-VAL-ASP-PRO-TYR   0.400892   58.33333     39
2           SER-THR-ASN-PRO-LEU   0.402575   54.16667     40
3           THR-ARG-SER-PRO-LEU   0.403965   58.33333     41
7           ARG-MET-ALA-PRO-VAL   0.440118   58.33333     42

The construct encoding Ala-Pro (e.g., #7) was recommended.

A.4. Further Excisions

In yet another truncation, an additional residue was trimmed from the β-20 and β-21
strands to further shorten the beta sheet. The target to match was
ILE423-ILE424-GLY425-GLY426-ALA433. Notable matches are shown in Table 5.



Table 5: Search of Beta sheet shortened by Two Residues (Ile423 through Ala433)

Rank        Sequence              RMSD       % Homology   Seq Id No.
Best fit:   THR-TYR-GLU-GLY-VAL   0.130107   79.16666     43
2           GLN-VAL-GLY-ASN-THR   0.138245   79.16666     44
3           THR-VAL-GLY-GLY-ILE   0.153362   100          45

A construct encoding Gly-Gly (e.g., #3), which has 100% homology, was
recommended.

Also searched were sequences encoding a deleted V1/V2 region and at least two
residues excised from β-2, β-3 or at least one residue excised from β-2 and β-3. The target to
match was: CYS119-VAL120-GLY121-GLY122-ILE201. Notable matches are shown in
Table 6.

Table 6: Search of V1/V2 loop (Cys119 through Ile201)

Rank        Sequence              RMSD       % Homology   Seq Id No.
Best fit:   ASP-LEU-PRO-GLY-CYS   0.250501   75           46
4           ASP-VAL-GLY-GLY-LEU   0.290383   100          47

It was determined that both constructs would be used.

B.1. Constructs encoding modified Env polypeptides

As described above, the native loops extruding from the 4-β antiparallel strands were
excised and replaced with 1 to 3 residue turns. The loops were replaced so as to leave the
entire β-strands, or excised by trimming one or more amino acids from each side of the
connected strands. The ends of the strands were rejoined with
turns that preserve the same backbone geometry (e.g., tertiary structure of β-20 and β-21), as
determined by searching the Brookhaven Protein Data Bank.

Table 7A is a summary of the truncations of the variable regions 1 and 2
recommended for this study, as determined in Example 1.A. above.

V1/V2 Modifications            SEQ ID NO   Figure
-LEU122-GLY-ASN-SER199-        7           10
-LYS121-ALA-PRO-VAL200-        6           9
-VAL120-GLY-GLY-ILE201-        4           7
-VAL120-PRO-GLY-ILE201B-       5           8
-VAL120-GLY-ALA-GLY-ALA204-    3           6
-VAL120-GLY-GLY-ALA-THR202-    8           11
-VAL127-GLY-ALA-GLY-ASN195-    25          28
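
The construct descriptors in Table 7A follow a regular pattern: a retained flanking residue with its HXB-2 number, the inserted turn residues, and the next retained residue with its HXB-2 number. The following sketch, offered only as an illustration, parses such a descriptor. The function name `parse_descriptor` and the returned field names are hypothetical, and the count of removed positions is taken purely from the HXB-2 numbering, so it may differ from the actual number of residues removed from a strain such as SF162 whose length differs in this region.

```python
import re

# Anchor residue: three-letter code, HXB-2 number, optional "B" variant suffix.
_ANCHOR = re.compile(r"^([A-Z]{3})(\d+)B?$")


def parse_descriptor(descriptor):
    """Split e.g. '-VAL120-GLY-GLY-ILE201-' into anchors and inserted turn."""
    tokens = [tok for tok in descriptor.strip("-").split("-") if tok]
    left = _ANCHOR.match(tokens[0])
    right = _ANCHOR.match(tokens[-1])
    if left is None or right is None:
        raise ValueError(f"unrecognised descriptor: {descriptor!r}")
    start, end = int(left.group(2)), int(right.group(2))
    turn = tokens[1:-1]
    removed = end - start - 1  # HXB-2 positions lying between the two anchors
    return {
        "left": (left.group(1), start),
        "right": (right.group(1), end),
        "turn": turn,
        "removed": removed,
        "net_change": len(turn) - removed,
    }


info = parse_descriptor("-VAL120-GLY-GLY-ILE201-")
```

For the descriptor shown, the sketch reports anchors Val120 and Ile201, an inserted Gly-Gly turn, 80 HXB-2 positions between the anchors, and a net change of -78 positions.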



As previously noted, the polypeptides encoded by the constructs of the present
invention are numbered relative to HXB-2, but the particular amino acid residue of the
polypeptides encoded by these exemplary constructs is based on SF162. Thus, for example,
although amino acid residue 195 in HXB-2 is a serine (S), constructs encoding polypeptides
having the wild type SF162 sequence will have an asparagine (N) at this position. Table 7B
shows just three of the variations in amino acid sequence between strains HXB-2 and SF162.
The entire sequences, including differences in residue and amino acid number, of HXB-2 and
SF162 are shown in the alignment of Figure 2 (SEQ ID NOs:1 and 2).



Table 7B

HXB-2 amino acid number   HXB-2 Residue   SF162 Residue/amino acid number
128                       Serine (S)      Thr(T)/114
195                       Serine (S)      Asn(N)/188
426                       Met(M)          Arg(R)/411
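
The numbering convention can be made concrete with a small lookup built from the three rows of Table 7B. The sketch below is illustrative only; the function names are hypothetical, and a complete mapping would have to be derived from the full alignment of Figure 2, which is not reproduced here.

```python
# Rows transcribed from Table 7B:
# HXB-2 position -> (HXB-2 residue, SF162 residue, SF162 position).
TABLE_7B = {
    128: ("Ser", "Thr", 114),
    195: ("Ser", "Asn", 188),
    426: ("Met", "Arg", 411),
}


def construct_label(hxb2_position):
    """SF162 residue written with its HXB-2 number, as in the construct names."""
    _, sf162_residue, _ = TABLE_7B[hxb2_position]
    return f"{sf162_residue}{hxb2_position}"


def numbering_offset(hxb2_position):
    """HXB-2 number minus SF162 number at an aligned position."""
    return hxb2_position - TABLE_7B[hxb2_position][2]


labels = [construct_label(pos) for pos in sorted(TABLE_7B)]
offsets = {pos: numbering_offset(pos) for pos in TABLE_7B}
```

The labels come out as Thr128, Asn195 and Arg426, and the offsets between the two numbering schemes are 14, 7 and 15 at the three positions. Because the offset is not constant, a fixed shift cannot convert between the two schemes; a position-by-position alignment is needed.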



Constructs containing deletions in the β-20 strand, β-21 strand and small loop were
also constructed. Shown in Table 8 are constructs encoding truncations in these regions. The
constructs in Table 8 are numbered relative to HXB-2 but the unmodified amino acid
sequence is based on SF162. Thus, the construct encodes an arginine (Arg) as is found in
SF162 in the amino acid position numbered 426 relative to HXB-2 (See, also, Table 7B).
Changes from wildtype (SF162) are shown in bold in Table 8B.



Table 8

Small Loop/β-20 and β-21 (Modified)   SEQ ID NO   Figure
-TRP427-GLY-GLY431-                   9           12
-ARG426-GLY-GLY-GLY431-               10          13
-ARG426-GLY-SER-GLY431B-              11          14
-ARG426-GLY-GLY-ASN-LYS432-           12          15
-ASN425-ALA-PRO-LYS432-               13          16
-ILE424-GLY-GLY-ALA433-               14          17
-ILE423-GLY-GLY-MET434-               15          18
-GLN422-GLY-GLY-TYR435-               16          19
-GLN422-ALA-PRO-TYR435B-              17          20



The deletion constructs shown in Tables 7 and 8 for each one of the β-strands and
combinations of them are constructed. These deletions will be tested in the Env forms gp120,
gp140 and gp160 from different HIV strains like subtype B strains (e.g., SF162, US4, SF2),
subtype E strains (e.g., CM235) and subtype C strains (e.g., AF110968 or AF110975).
Exemplary constructs for SF162 are shown in the
Figures and are summarized in Table 9. As noted above in Figure 2 and Table 7B, in the
bridging sheet region, the amino acid sequence of SF162 differs from HXB-2 in that the
Met426 of HXB-2 is an Arg in SF162. In Table 9, V1/V2 refers to deletions in the V1/V2
region; bsm refers to a modification in the bridging sheet small loop.



Table 9

Construct                       Seq. Id.   Fig.   Modification/Amino acid sequence
Val120-Ala204                   3          6      V1/V2: Val120-Gly-Ala-Gly-Ala204
Val120-Ile201                   4          7      V1/V2: Val120-Gly-Gly-Ile201
Val120-Ile201B                  5          8      V1/V2: Val120-Pro-Gly-Ile201
Lys121-Val200                   6          9      V1/V2: Lys121-Ala-Pro-Val200
Leu122-Ser199                   7          10     V1/V2: Leu122-Gly-Asn-Ser199
Val120-Thr202                   8          11     V1/V2: Val120-Gly-Gly-Ala-Thr202
Trp427-Gly431                   9          12     bsm: Trp427-Gly-Gly431
Arg426-Gly431                   10         13     bsm: Arg426-Gly-Gly-Gly431
Arg426-Gly431B                  11         14     bsm: Arg426-Gly-Ser-Gly431
Arg426-Lys432                   12         15     bsm: Arg426-Gly-Gly-Asn-Lys432
Asn425-Lys432                   13         16     bsm: Asn425-Ala-Pro-Lys432
Ile424-Ala433                   14         17     bsm: Ile424-Gly-Gly-Ala433
Ile423-Met434                   15         18     bsm: Ile423-Gly-Gly-Met434
Gln422-Tyr435                   16         19     bsm: Gln422-Gly-Gly-Tyr435
Val127-Asn195                   25         28     bsm: Val127-Gly-Ala-Gly-Asn195
Gln422-Tyr435B                  17         20     bsm: Gln422-Ala-Pro-Tyr435
Leu122-Ser199; Arg426-Gly431    18         21     V1/V2/bsm: Leu122-Gly-Asn-Ser199 / Arg426-Gly-Gly-Gly431
Leu122-Ser199; Arg426-Lys432    19         22     V1/V2/bsm: Leu122-Gly-Asn-Ser199 / Arg426-Gly-Gly-Asn-Lys432
Leu122-Ser199-Trp427-Gly431     20         23     V1/V2/bsm: Leu122-Gly-Asn-Ser199 / Trp427-Gly-Gly431
Lys121-Val200-Asn425-Lys432     21         24     V1/V2/bsm: Lys121-Ala-Pro-Val200 / Asn425-Ala-Pro-Lys432
Val120-Ile201-Ile424-Ala433     22         25     V1/V2/bsm: Val120-Gly-Gly-Ile201 / Ile424-Gly-Gly-Ala433
Val120-Ile201B-Ile424-Ala433    23         26     V1/V2/bsm: Val120-Pro-Gly-Ile201 / Ile424-
Val120-Thr202;Ile424-Ala433     24         27     V1/V2/bsm: Val120-Gly-Gly-Ala-Thr202 / Ile424-Gly-Gly-Ala433
Val127-Asn195; Arg426-Gly431    25         29     V1/V2/bsm: Val127-Gly-Ala-Gly-Asn195 / Arg426-Gly-Gly-Gly431


Combinations of V1/V2 deletions and bridging sheet small loop modifications in
addition to those specifically shown in Table 9 are also within the scope of the present
invention. Various forms of the different embodiments of the invention, described herein,
may be combined.
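
By way of illustration only, the pairwise combinations of the exemplified modifications can be enumerated as follows. The construct names are transcribed from Tables 7A, 8 and 9; grouping Val127-Asn195 with the V1/V2 modifications follows Table 7A. The variable names are hypothetical, and the sketch makes no statement about which combinations fold or express correctly.

```python
from itertools import product

# V1/V2 modifications (Table 7A) and small loop modifications (Table 8).
V1V2 = [
    "Val120-Ala204", "Val120-Ile201", "Val120-Ile201B", "Lys121-Val200",
    "Leu122-Ser199", "Val120-Thr202", "Val127-Asn195",
]
BSM = [
    "Trp427-Gly431", "Arg426-Gly431", "Arg426-Gly431B", "Arg426-Lys432",
    "Asn425-Lys432", "Ile424-Ala433", "Ile423-Met434", "Gln422-Tyr435",
    "Gln422-Tyr435B",
]

# Combined constructs already listed in Table 9.
LISTED = {
    ("Leu122-Ser199", "Arg426-Gly431"),
    ("Leu122-Ser199", "Arg426-Lys432"),
    ("Leu122-Ser199", "Trp427-Gly431"),
    ("Lys121-Val200", "Asn425-Lys432"),
    ("Val120-Ile201", "Ile424-Ala433"),
    ("Val120-Ile201B", "Ile424-Ala433"),
    ("Val120-Thr202", "Ile424-Ala433"),
    ("Val127-Asn195", "Arg426-Gly431"),
}

# Every V1/V2 modification paired with every small loop modification.
all_pairs = list(product(V1V2, BSM))
unlisted = [pair for pair in all_pairs if pair not in LISTED]
```

With seven V1/V2 modifications and nine small loop modifications there are 63 pairings, of which 8 appear in Table 9 and 55 do not.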

The first screening will be done after transient expression in COS-7, RD and/or 293
cells. The proteins that are expressed will be analyzed by immunoblot, ELISA, and for
binding to mAbs directed to the CD4 binding site and other important epitopes on gp120 to
determine integrity of structure. They will also be tested in a CD4 binding assay and, in
addition, for the binding of neutralizing antibodies, for example using patient sera or mAb 448D
(directed to Glu370 and Tyr384, a region of the CD4 binding groove that is not altered by the
deletions).

The immunogenicity of these novel Env glycoproteins will be tested in rodents and
primates. The structures will be administered as DNA vaccines or adjuvanted protein
vaccines or in combined modalities. The goal of these vaccinations will be to achieve broadly
reactive neutralizing antibody responses.

Claims:



What is claimed is: 

5 1 . A polynucleotide encoding a modified HIV Env polypeptide wherein the 

polypeptide has at least one amino acid deleted or replaced in the region corresponding to 
residues 420 to 436 relative to HXB-2 (SEQ ID NO: 1). 

2. The polynucleotide of claim 1, wherein the region corresponding to residues 124- 
10 1 98 relative to HXB-2 is deleted and at least one amino acid is deleted or replaced in the 

regions corresponding to the residues 1 19 to 123 and 199 to 210 relative to HXB-2 (SEQ ID 
NO:l). 

3. The polynucleotide of claim 1, wherein at least one amino acid in the region 

15 corresponding to residues 427 through 429 relative to HXB-2 (SEQ ID NO: I) is deleted or 
replaced. 

4. The polynucleotide of claim 2, wherein at least one amino acid in the region 
corresponding to residues 427 through 429 relative to HXB-2 (SEQ ID NO:1) is deleted or 
replaced. 

5. The polynucleotide of claim 1, wherein the amino acid sequence of the modified 
HIV Env polypeptide is based on strain SF162. 

6. An immunogenic modified HIV Env polypeptide having at least one amino acid 
deleted or replaced in the region corresponding to residues 420 through 436, relative to 
HXB-2 (SEQ ID NO:1). 

7. The polypeptide of claim 6, wherein one amino acid is deleted in the region 
corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1). 
8. The polypeptide of claim 6, wherein more than one amino acid is deleted in the 
region corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1). 



9. The polypeptide of claim 6, wherein at least one amino acid is replaced in the 
region corresponding to residues 420 through 436, relative to HXB-2 (SEQ ID NO:1). 

10. The polypeptide of claim 6, wherein at least one amino acid residue between 
about amino acid residue 427 and amino acid residue 429 relative to HXB-2 (SEQ ID NO:1) 
is deleted or replaced. 

11. The polypeptide of claim 6, wherein the V1 and V2 regions of the polypeptide are 
truncated. 

12. The polypeptide of claim 10, wherein the V1 and V2 regions of the polypeptide 
are truncated. 

13. The polypeptide of claim 6, wherein the amino acid sequence of the modified 
HIV Env polypeptide is based on strain SF162. 

14. A construct comprising the nucleotide sequence depicted in Figure 6 (SEQ ID 
NO:3). 

15. A construct comprising the nucleotide sequence depicted in Figure 7 (SEQ ID 
NO:4). 

16. A construct comprising the nucleotide sequence depicted in Figure 8 (SEQ ID 
NO:5). 

17. A construct comprising the nucleotide sequence depicted in Figure 9 (SEQ ID 
NO:6). 
18. A construct comprising the nucleotide sequence depicted in Figure 10 (SEQ ID 
NO:7). 

19. A construct comprising the nucleotide sequence depicted in Figure 11 (SEQ ID 
NO:8). 

20. A construct comprising the nucleotide sequence depicted in Figure 12 (SEQ ID 

NO:9). 

21. A construct comprising the nucleotide sequence depicted in Figure 13 (SEQ ID 
NO:10). 

22. A construct comprising the nucleotide sequence depicted in Figure 14 (SEQ ID 
NO:11). 

23. A construct comprising the nucleotide sequence depicted in Figure 15 (SEQ ID 
NO:12). 

24. A construct comprising the nucleotide sequence depicted in Figure 16 (SEQ ID 
NO:13). 

25. A construct comprising the nucleotide sequence depicted in Figure 17 (SEQ ID 
NO:14). 

26. A construct comprising the nucleotide sequence depicted in Figure 18 (SEQ ID 
NO:15). 

27. A construct comprising the nucleotide sequence depicted in Figure 19 (SEQ ID 
NO:16). 

28. A construct comprising the nucleotide sequence depicted in Figure 20 (SEQ ID 
NO:17). 
29. A construct comprising the nucleotide sequence depicted in Figure 21 (SEQ ID 
NO:18). 

30. A construct comprising the nucleotide sequence depicted in Figure 22 (SEQ ID 
NO:19). 

31. A construct comprising the nucleotide sequence depicted in Figure 23 (SEQ ID 
NO:20). 

32. A construct comprising the nucleotide sequence depicted in Figure 24 (SEQ ID 
NO:21). 

33. A construct comprising the nucleotide sequence depicted in Figure 25 (SEQ ID 
NO:22). 


34. A construct comprising the nucleotide sequence depicted in Figure 26 (SEQ ID 
NO:23). 

35. A construct comprising the nucleotide sequence depicted in Figure 27 (SEQ ID 
NO:24). 

36. A construct comprising the nucleotide sequence depicted in Figure 28 (SEQ ID 
NO:25). 

37. A construct comprising the nucleotide sequence depicted in Figure 29 (SEQ ID 
NO:26). 

38. A vaccine composition comprising a polynucleotide encoding a modified Env 
polypeptide according to any one of claims 1-5. 


39. A vaccine composition comprising a polynucleotide construct encoding a 
modified Env polypeptide according to any of claims 14-37. 
40. A vaccine composition comprising a modified Env polypeptide according to any 
of claims 6-13. 



41. The vaccine composition of any of claims 38-40, further comprising an adjuvant. 

42. A method of inducing an immune response in a subject comprising administering a 
polynucleotide according to any one of claims 1-5 in an amount sufficient to induce an 
immune response in the subject. 

43. A method of inducing an immune response in a subject comprising administering a 
polynucleotide construct according to any one of claims 14-37 in an amount sufficient to 
induce an immune response in the subject. 

44. A method of inducing an immune response in a subject comprising administering 
a composition comprising a modified Env polypeptide according to any one of claims 6-13, 
wherein the composition is administered in an amount sufficient to induce an immune 
response in the subject. 

45. The method of any of claims 42-44 further comprising administering an adjuvant 
to the subject. 

46. A method of inducing an immune response in a subject comprising 

(a) administering a first composition comprising a polynucleotide according to any of 
claims 1-5 in a priming step and 

(b) administering a second composition comprising a modified Env polypeptide 
according to any of claims 6-13, as a booster, in an amount sufficient to induce an immune 
response in the subject. 

47. The method of claim 46 wherein the first composition or second composition 
further comprises an adjuvant. 
48. The method of claim 46 wherein the first and second compositions further 
comprise an adjuvant. 
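
The residue windows recited in the claims above (420 to 436, and the narrower 427 to 429, both in HXB-2 numbering) can be checked mechanically against a candidate modification. The following Python sketch is illustrative only: the function name, the constant names and the example positions are hypothetical and are not taken from the specification, and the positions are assumed to have already been mapped to HXB-2 numbering by alignment.

```python
from typing import Iterable, List

# Inclusive residue windows in HXB-2 numbering, as recited in the claims.
# The constant names are hypothetical labels chosen for this sketch.
WINDOW_420_436 = range(420, 437)
WINDOW_427_429 = range(427, 430)


def modified_in_window(modified_positions: Iterable[int], window: range) -> List[int]:
    """Return the sorted HXB-2 positions that were deleted or replaced
    and that fall inside the given window."""
    return sorted(p for p in set(modified_positions) if p in window)


if __name__ == "__main__":
    # Hypothetical example: residues 428, 429 and 430 are deleted or replaced.
    example = [428, 429, 430]
    print(modified_in_window(example, WINDOW_420_436))
    print(modified_in_window(example, WINDOW_427_429))
```

A non-empty result for the first window corresponds to the "at least one amino acid deleted or replaced" language; the sketch does not assess immunogenicity or any other claim element.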
SEQ ID NO:3 VAL120-ALA204 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGC 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGCCGGCGCCTGCCCCAA 

GGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTG 

CAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCC 

ACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGC 

GTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGA 

GAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCC 

CCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACA 

TCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTC 

GGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAG 

CTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAA 

CAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGA 

TCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATC 

CGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAA 

CACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGT 

ACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGC 

GTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCC 

GCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAG 

CGGCATCGTGCAGCAGGAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGC 

AGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTG 

AAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGT 

GCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGA 

TGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGC 

CAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGT 

GGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCG 

GCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCT 

ACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCA 

TCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTG 

GCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTG 

ATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTAC 

TGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGA 

CGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCG 

GCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAAC 

TCGAG 
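
Because the listings in these figures were recovered from scanned pages, a quick mechanical check of each sequence can help before any downstream use. The Python sketch below joins the lines of a listing, reports any characters other than A, C, G and T, and checks the flanking motifs visible in the listings themselves: each begins with GAATTC (the EcoRI recognition sequence) and most end with CTCGAG (the XhoI recognition sequence). Treating these motifs as cloning sites is an assumption of this sketch, and the function names and the shortened toy input are hypothetical.

```python
import re
from typing import Dict, List, Tuple

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
XHOI_SITE = "CTCGAG"   # XhoI recognition sequence


def clean_listing(text: str) -> Tuple[str, List[str]]:
    """Join a multi-line listing into one uppercase string and
    return it with the sorted set of non-ACGT characters found."""
    joined = "".join(text.split()).upper()
    unexpected = sorted(set(re.findall(r"[^ACGT]", joined)))
    return joined, unexpected


def flank_report(seq: str) -> Dict[str, object]:
    """Report whether the sequence carries the expected flanking motifs
    and where the first ATG occurs (-1 if absent)."""
    return {
        "starts_with_ecori": seq.startswith(ECORI_SITE),
        "ends_with_xhoi": seq.endswith(XHOI_SITE),
        "first_atg_index": seq.find("ATG"),
    }


if __name__ == "__main__":
    # Toy input: first and last few bases of a listing, not a full sequence.
    toy = "GAATTCGCCACCATGGATGCA\nCTGCTGTAACTCGAG"
    seq, unexpected = clean_listing(toy)
    print(unexpected)
    print(flank_report(seq))
```

Any unexpected characters flagged by `clean_listing` point to scanning damage that should be resolved against the original figure rather than guessed.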
SEQ ID NO:4 VAL120-ILE201 

GAATTCGCCACCATGGATGCAATGAAGAG 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCATCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGC 

CAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGAT 

CAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCG 

AGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAG 

CGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTG 

GGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCT 

GCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACC 

TGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGC 

TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCAC 

CGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGA 

CCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAG 

GAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCA 

GCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCG 

TGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCC 

AGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCG 

AGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGG 

CCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCT 

CGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCT 

GAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCC 

TGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGC 

GCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGC 

TGTAACTCGAG 



FIG. 7 



WO 00/39303 44 / 65 PCT/US99/31272 



SEQ ID NO:5 VAL120-ILE201B 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCAGTCTOG 
TTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCA 
CCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTGGGCCACCC 
ACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACA 
TGTCGAAGAACAACATGGTGGAGCAGATGCACGAGGAC 

CCTGCGTGCCCGGCATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGC 

CCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGT 

GAGCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCT 

GGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCC 

CGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGC 

GAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC 

OT^CAA^C^^^^ 

TTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAAC 

GGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATG 

TACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACG 

GCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGC 

GCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGC 

GCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGC 

CGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGT 

GCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGG 

CATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCAT 

CTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTW 

CCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCT 

GATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGG 

ACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGC 

GGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCC 
AGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCIXjGTGCACGGCCTGCTGGCCCTG^ 
GGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTG^ 
CATCGTGGA^GCTGCTGGGCCGCCGCGGCTGGGAGGCCC^^ 

GATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCAC 

?SIccgcatcatcg^a^g^gVg^ 

GGCTTCGAGCGCGCCCTGCTGTAACTCGAGCGTGCT 



FIG. 8 






SEQ ID NO:6 LYS121-VAL200 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGGCCCCCGTGATCACCCA 

GGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGC 

CATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCG 

TGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGG 

CCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTG 

CAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCAT 

CACCATCGGCCGCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGC 

CCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGC 

AGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATC 

GTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC 

AGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCG 

CATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTAGAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAGCGTGCT 



FIG. 9 




SEQ ID NO:7: LEU122-SER199 






GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCC 

CCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGC 

GGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAA 

CTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCA 

CCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTC 

CTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAG 

GCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGC 

CCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGG 

CCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTG 

ATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTG 

GAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACA 

CCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGA 

CAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTT 

CATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAA 

CCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCC 

CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGGAGCCCC 

CTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTAC 

CACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGC 

TGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAG 

CGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGA 

GGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGA 

GCGCGCCCTGCTGTAACTCGAGCGTGCT 



FIG. 10 




SEQ ID NO:8 VAL120-THR202 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCT^ 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCGCCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTT^ 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCAACCGCTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGC 

CAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGAT 

CAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCG 

AGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAG 

CGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTG 

GGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCT 

GCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACC 

TGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGC 

TACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCAC 

CGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGA 

CCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAG 

GAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCA 

GCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCG 

TGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCC 

AGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCG 

AGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGG 

CCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCG 

CGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCT 

GAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCC 

TGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGC 

GCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGC 

TGTAACTCGAG 



FIG. 11 




SEQ ID NO:9 TRP427-GLY431 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCT 

GGGGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATC 

ACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCG 

CCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGA 

AGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAG 

CGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGC 

GCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCA 

GAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCA 

TCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTG 

GGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTG 

GAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAG 

ATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAA 

GAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCA 

GCAAGTGGCTGTGGTACATCAAGATCTTCATCATC 

TCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCC 
AGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGC 
GAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGA 
CCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCG 
CATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGC 
AGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCC 
GTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCA 
CATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 12 






SEQ ID NO:10 ARG426-GLY431 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTG 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCGGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 13 




SEQ ID NO:11 ARG426-GLY431B 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCAGCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 

SEQ ID NO:12 ARG426-LYS432 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGGACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGC 

GGCGGCAACAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACAT 

CACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCC 

GCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTG 

AAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAA 

GCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGG 

CGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGGAGC 

AGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGC 

ATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCT 

GGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCT 

GGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGA 

GATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGA 

AGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATC 

AGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGC 

ATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTC 

CAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 

CGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACG 

ACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCC 

GCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTG 

CAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGC 

CGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 15 






SEQ ID NO:13 ASN425-LYS432 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACGCCC 

CCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCC 

TGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGC 

GGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGA 

GCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCG 

TGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCA 

GCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAAC 

CTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCA 

GCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCT 

GGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAAC 

AAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAA 

CTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGC 

AGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGG 

CTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTC 

ACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGC 

TTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGA 

CCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAG 

CCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGA 

GCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGA 

TCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAG 

GGCACCGACCGCATCATCGAGGTGGCCCAGCGCATGGGCCGCGCCTTCCTGCACATCCCCCGC 

CGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 16 






SEQ ID NO:14 ILE424-ALA433 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCGGCGGC 

GCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTG 

CTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGG 

CGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCC 

TGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGGGCCGTGACC 

CTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTG 

ACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCT 

GCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGC 

AGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGC 

TGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAG 

CCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACA 

CCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGA 

GCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGT 

GGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCG 

TGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCC 

CCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGC 

GACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTG 

TGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTG 

CTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCA 

GGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCA 

CCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCA 

TCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 

SEQ ID NO:15 ILE423-MET434

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCGGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCGGCGGCATG 

TACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACC 

CGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACAT 

GCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCG 

TGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGC 

GCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTG 

ACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGC 

CATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCC 

GCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGC 

GGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGA 

CCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACC 

TGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTG 

GAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACAT 

CAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAG 

CATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGAGCCGCTTCCCCGCCCC 

CCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGC 

AGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTG 

TTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGC 

CGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCT 

GAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACC 

GCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCC 

AGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



SEQ ID NO:16 GLN422-TYR435 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGGGCGGCTACGCC 

CCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGAC 

GGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGA 

CAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCC 

CCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATG 

TTCCTGGGCTTCCTGGGCGCCGCCGGCAGGACCATGGGCGCCCGCAGCCTGACCCTGACCGTG 

CAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGA 

GGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGC 

TGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAG 

CTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGAT 

CTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCT 

ACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCT 

GGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGA 

TCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCG 

TGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCG 

GCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAG 

CCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAG 

CTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCG 

CGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGA 

ACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATC 

ATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGC 

TTCGAGCGCGCCCTGCTGTAACTCGAG 

SEQ ID NO:17 GLN422-TYR435B 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

ACCCTGCACTGCACCAACCTGAAGAACGCCACCAACACCAAGAGCAGCAACTGGAAGGAGAT 

GGACCGCGGCGAGATCAAGAACTGCAGCTTCAAGGTGACCACCAGCATCCGCAACAAGATGC 

AGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCAGC 

TACAAGCTGATCAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGA 

GCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAA 

GTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCC 

CCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGC 

AGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGAT 

CAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCT 

TCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAG 

AAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGAC 

CATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCG 

GCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCG 

GCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGGCCCCCTACGCCC 

CCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACG 

GCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGAC 

AACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCC 

CACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGT 

TCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGC 

AGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAG 

GCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCT 

GGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGC 

TGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATC 

TGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTA 

CACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTG 

GACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGAT 

CTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGT 

GAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGG 

CCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGC 

CCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGC 

TACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGC 

GGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAA 

CAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCAT 

CGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTT 

CGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 20 




SEQ ID NO:18: LEU122-SER199; ARG426-GLY431 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCGGCGGCGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAG 



FIG. 21 




SEQ ID NO:19 LEU122-SER199; ARG426-LYS432 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC 

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCGGCGGCAACAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG 

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC

CTGCTGTAACTCGAG 



FIG. 22 




SEQ ID NO: 20: LEU122-SER199; TRP427-GLY431 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGGGCAACAGCGTGAT 

CACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGG 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGA 

GCACCGTGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGC 

AGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCAT 

CATCGTGCAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCA 

AGAGCATCACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCC 

GCCAGGCCCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACC 

AAGCTGCAGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCC

CGAGATCGTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCT 

GTTCAACAGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGC 

CCTGCCGCATCAAGCAGATCATCAACCGCTGGGGCGGCAAGGCCATGTACGCCCCCCCCATCC 

GCGGCCAGATCCGCTGCAGCAGCAACATCACCGGCCTGCTGCTGACCGGCGACGGCGGCAAG 

GAGATCAGCAACACCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CAGCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGG 

CCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGC 

TTCCTGGGCGCCGCCGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGC 

CAGCTGCTGAGCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGG 

AGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGC 

ACCACCGCCGTGCCCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAA 

CATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGA 

TCGAGGAGAGCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTG 

GGCCAGCCTGTGGAACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCAT 

GATCGTGGGCGGCCTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGT 

GCGCCAGGGCTACAGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCG

CCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGC 

ACGGCCTGCTGGCCCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCC 

TGCGCGACCTGATCCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGG 

CCCTGAAGTACTGGGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTG 

AGCCTGTTCGACGCCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCC 

CAGCGCATCGGCCGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCC 

CTGCTGTAACTCGAG 



FIG. 23 




SEQ ID NO:21 LYS121-VAL200; ASN425-LYS432 

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGGCCCCCGTGATCACCCA 

GGCCTGCCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGC 

CATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCG 

TGCAGTGCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGG 

CCGAGGAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTG 

CAGCTGAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCAT 

CACCATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGC 

CCACTGCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGC 

AGGCCCAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATC 

GTGATGCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAAC 

AGCACCTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCG 

CATCAAGCAGATCATCAACGCCCCCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCG 

CTGCAGCAGCAACATCACCGGCCTGCTGCTGACGCGCGACGGCGGCAAGGAGATCAGCAACA 

CCACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTAC 

AAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGT

GGTGCAGCGCGAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGC

CGGCAGCACCATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCG

GCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAG 

CTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAA 

GGACCAGCAGCTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGC 

CCTGGAACGCCAGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATG 

GAGTGGGAGCGCGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCA 

GAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGG 

AACTGGTTCGACATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGC 

CTGGTGGGCCTGCGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTAC 

AGCCCCCTGAGCTTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATC

GAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGC 

CCTGATCTGGGACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGAT 

CCTGATCGCCGCCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTG 

GGGCAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACG 

CCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGC 

CGCGCCTTCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTC 

GAG 



FIG. 24 




SEQ ID NO:22 VAL120-ILE201; ILE424-ALA433



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCATCACCCAGGCCTG

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGC 

AACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGAT 

CTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGG 

TGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGC 

GAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACC

ATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCA 

GCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGT 

GGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAG 

CTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCC 

AGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCG 

CGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGG 

AGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGAC 

ATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTG 

CGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGC 

TTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGG 

CGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCG 

CCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTG 

CTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 25 




SEQ ID NO:23: VAL120-ILE201B; ILE424-ALA433 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGCCCGGCATCACCCAGGCCTGC 

CCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTG 

AAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTG 

CACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGG 

AGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTG 

AAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCAT 

CGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTG 

CAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCC 

AGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATG 

CACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCGAGCTGTTCAACAGCACC 

TGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAA 

GCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCA 

ACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATC 

TTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGT 

GGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCG 

AGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCA 

TGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAG 

CAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTG 

GGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGC 

TGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCA 

GCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGC 

GAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGA 

GAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACA 

TCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGC 

GCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCT 

TCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGC 

GGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGA 

CGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGC 

CCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGC 

TGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 



FIG. 26 




SEQ ID NO:24 VAL120-THR202; ILE424-ALA433



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGGGCGGCGCCACCCAGGCCTG 

CCCCAAGGTGAGCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGT 

GCACCCACGGCATCCGCCCCGTGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAG 

GAGGGCGTGGTGATCCGCAGCGAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCT 

GAAGGAGAGCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCA 

TCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACT 

GCAACATCAGCGGCGAGAAGTGGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCC 

CAGTTCGGCAACAAGACCATCGTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGAT 

GCACAGCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCAC 

CTGGAACAACACCATCGGCCCCAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCA 

AGCAGATCATCGGCGGCGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGC 

AACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGGAACACCACCGAGAT 

CTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGG 

TGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGC 

GAGAAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACC 

ATGGGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGGA 

GCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGT 

GGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAG 

CTGCTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCC 

AGCTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCG 

CGAGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGG 

AGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGAC 

ATCAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTG 

CGCATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGC 

TTCCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGG 

CGGCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGG 

ACGACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCG 

CCCGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTG 

CTGCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATC 

GCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCT 

GCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:25 VAL127-ASN195

GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

GGGGCAGGGAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCC 

CATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTT 

CAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCG 

TGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGC 

GAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGATCAA 

CTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTA 

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGG 

CGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCC 

CAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCTGGC 

AGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAAC 

ATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTT 

CCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGG 

TGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAG 

AAGCGCGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATG 

GGCGCCCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCA 

GCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGG 

GCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTG 

CTGGGCATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAG 

CTGGAGCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCG 

AGATCGACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAG 

AAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACAT 

CAGCAAGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCG 

CATCGTGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTT 

CCAGACCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCG 

GCGAGCGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGAC 

GACCTGCGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCC 

CGCATCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCT 

GCAGTACTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCG 

CCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGC 

ACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
SEQ ID NO:26 VAL127-ASN195; ARG426-GLY431 



GAATTCGCCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCA 

GTCTTCGTTTCGCCCAGCGCCGTGGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTG 

TGGAAGGAGGCCACCACCACCCTGTTCTGCGCCAGCGACGCCAAGGCCTACGACACCGAGGT 

GCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCT 

GGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAG 

GACATCATCAGCCTGTGGGACCAGAGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTG 

GGGGCAGGGAACTGCAACACCAGCGTGATCACCCAGGCCTGCCCCAAGGTGAGCTTCGAGCC 

CATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTT 

CAACGGCAGCGGCCCCTGCACCAACGTGAGCACCGTGCAGTGCACCCACGGCATCCGCCCCG 

TGGTGAGCACCCAGCTGCTGCTGAACGGCAGCCTGGCCGAGGAGGGCGTGGTGATCCGCAGC 

GAGAACTTCACCGACAACGCCAAGACCATCATCGTGCAGCTGAAGGAGAGCGTGGAGATCAA 

CTGCACCCGCCCCAACAACAACACCCGCAAGAGCATCACCATCGGCCCCGGCCGCGCCTTCTA 

CGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCAGCGGCGAGAAGT 

GGAACAACACCCTGAAGCAGATCGTGACCAAGCTGCAGGCCCAGTTCGGCAACAAGACCATC 

GTGTTCAAGCAGAGCAGCGGCGGCGACCCCGAGATCGTGATGCACAGCTTCAACTGCGGCGG 

CGAGTTCTTCTACTGCAACAGCACCCAGCTGTTCAACAGCACCTGGAACAACACCATCGGCCC 

CAACAACACCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACCGCGGCG 

GCGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCAGCAGCAACATCACC 

GGCCTGCTGCTGACCCGCGACGGCGGCAAGGAGATCAGCAACACCACCGAGATCTTCCGCCC 

CGGCGGCGGCGACATGCGCGACAACTGGCGCAGCGAGCTGTACAAGTACAAGGTGGTGAAG 

ATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCG 

CGCCGTGACCCTGGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCAGCACCATGGGCGC 

CCGCAGCCTGACCCTGACCGTGCAGGCCCGCCAGCTGCTGAGCGGCATCGTGCAGCAGCAGA 

ACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATC 

AAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGG 

CATCTGGGGCTGCAGCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGGAACGCCAGCTGGA 

GCAACAAGAGCCTGGACCAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATC 

GACAACTACACCAACCTGATCTACACCCTGATCGAGGAGAGCCAGAACCAGCAGGAGAAGAA 

CGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCAGCCTGTGGAACTGGTTCGACATCAGCA 

AGTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCG 

TGTTCACCGTGCTGAGCATCGTGAACCGCGTGCGCCAGGGCTACAGCCCCCTGAGCTTCCAGA 

CCCGCTTCCCCGCCCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAG 

CGCGACCGCGACCGCAGCAGCCCCCTGGTGCACGGCCTGCTGGCCCTGATCTGGGACGACCTG 

CGCAGCCTGTGCCTGTTCAGCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCATC 

GTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGGGCAACCTGCTGCAGTA 

CTGGATCCAGGAGCTGAAGAACAGCGCCGTGAGCCTGTTCGACGCCATCGCCATCGCCGTGG 

CCGAGGGCACCGACCGCATCATCGAGGTGGCCCAGCGCATCGGCCGCGCCTTCCTGCACATCC 

CCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAACTCGAG 
<110> Chiron Corporation 

<120> MODIFIED HIV ENV POLYPEPTIDES 

<130> 1605.100 

<140> 
<141> 

<160> 26 

<170> PatentIn Ver. 2.0 

<210> 1 
<211> 856 
<212> PRT 

<213> Human immunodeficiency virus 



<400> 1 

Met Arg Val Lys Glu Lys Tyr Gln His Leu Trp Arg Trp Gly Trp Arg 
1 5 10 15 

Trp Gly Thr Met Leu Leu Gly Met Leu Met Ile Cys Ser Ala Thr Glu 
20 25 30 

Lys Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala 
35 40 45 

Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu 
50 55 60 

Val His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn 
65 70 75 80 

Pro Gln Glu Val Val Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp 
85 90 95 

Lys Asn Asp Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp 
100 105 110 

Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Ser 
115 120 125 

Leu Lys Cys Thr Asp Leu Lys Asn Asp Thr Asn Thr Asn Ser Ser Ser 
130 135 140 

Gly Arg Met Ile Met Glu Lys Gly Glu Ile Lys Asn Cys Ser Phe Asn 
145 150 155 160 

Ile Ser Thr Ser Ile Arg Gly Lys Val Gln Lys Glu Tyr Ala Phe Phe 
165 170 175 

Tyr Lys Leu Asp Ile Ile Pro Ile Asp Asn Asp Thr Thr Ser Tyr Lys 
180 185 190 
Leu Thr Ser Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys Val 
195 200 205 

Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe Ala 
210 215 220 

Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro Cys Thr 
225 230 235 240 

Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val Ser 
245 250 255 



Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val Ile 
260 265 270 

Arg Ser Val Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile Val Gln Leu 
275 280 285 

Asn Thr Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr Arg 
290 295 300 

Lys Arg Ile Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val Thr Ile 
305 310 315 320 

Gly Lys Ile Gly Asn Met Arg Gln Ala His Cys Asn Ile Ser Arg Ala 
325 330 335 

Lys Trp Asn Asn Thr Leu Lys Gln Ile Ala Ser Lys Leu Arg Glu Gln 
340 345 350 

Phe Gly Asn Asn Lys Thr Ile Ile Phe Lys Gln Ser Ser Gly Gly Asp 
355 360 365 

Pro Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr 
370 375 380 

Cys Asn Ser Thr Gln Leu Phe Asn Ser Thr Trp Phe Asn Ser Thr Trp 
385 390 395 400 

Ser Thr Glu Gly Ser Asn Asn Thr Glu Gly Ser Asp Thr Ile Thr Leu 
405 410 415 

Pro Cys Arg Ile Lys Gln Ile Ile Asn Met Trp Gln Lys Val Gly Lys 
420 425 430 

Ala Met Tyr Ala Pro Pro Ile Ser Gly Gln Ile Arg Cys Ser Ser Asn 
435 440 445 

Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Asn Ser Asn Asn Glu 
450 455 460 

Ser Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg 
465 470 475 480 

Ser Glu Leu Tyr Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Val 
485 490 495 

Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala 
500 505 510 
Val Gly Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser 
515 520 525 

Thr Met Gly Ala Ala Ser Met Thr Leu Thr Val Gln Ala Arg Gln Leu 
530 535 540 

Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu 
545 550 555 560 

Ala Gln Gln His Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu 
565 570 575 

Gln Ala Arg Ile Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln Leu 
580 585 590 

Leu Gly Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val 
595 600 605 

Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Glu Gln Ile Trp Asn 
610 615 620 

His Thr Thr Trp Met Glu Trp Asp Arg Glu Ile Asn Asn Tyr Thr Ser 
625 630 635 640 

Leu Ile His Ser Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn 
645 650 655 

Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp 
660 665 670 

Phe Asn Ile Thr Asn Trp Leu Trp Tyr Ile Lys Leu Phe Ile Met Ile 
675 680 685 

Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser Ile 
690 695 700 

Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr His 
705 710 715 720 

Leu Pro Thr Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu Glu 
725 730 735 

Gly Gly Glu Arg Asp Arg Asp Arg Ser Ile Arg Leu Val Asn Gly Ser 
740 745 750 

Leu Ala Leu Ile Trp Asp Asp Leu Arg Ser Leu Cys Leu Phe Ser Tyr 
755 760 765 

His Arg Leu Arg Asp Leu Leu Leu Ile Val Thr Arg Ile Val Glu Leu 
770 775 780 

Leu Gly Arg Arg Gly Trp Glu Ala Leu Lys Tyr Trp Trp Asn Leu Leu 
785 790 795 800 

Gln Tyr Trp Ser Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn 
805 810 815 

Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val 
820 825 830 
Val Gln Gly Ala Cys Arg Ala Ile Arg His Ile Pro Arg Arg Ile Arg 
835 840 845 

Gln Gly Leu Glu Arg Ile Leu Leu 
850 855 



<210> 2 
<211> 847 
<212> PRT 

<213> Human immunodeficiency virus 
<400> 2 

Met Arg Val Lys Gly Ile Arg Lys Asn Tyr Gln His Leu Trp Arg Gly 
1 5 10 15 

Gly Thr Leu Leu Leu Gly Met Leu Met Ile Cys Ser Ala Val Glu Lys 
20 25 30 

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 
35 40 45 

Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val 
50 55 60 

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro 
65 70 75 80 

Gln Glu Ile Val Leu Glu Asn Val Thr Glu Asn Phe Asn Met Trp Lys 
85 90 95 

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp 
100 105 110 

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu 
115 120 125 

His Cys Thr Asn Leu Lys Asn Ala Thr Asn Thr Lys Ser Ser Asn Trp 
130 135 140 

Lys Glu Met Asp Arg Gly Glu Ile Lys Asn Cys Ser Phe Lys Val Thr 
145 150 155 160 

Thr Ser Ile Arg Asn Lys Met Gln Lys Glu Tyr Ala Leu Phe Tyr Lys 
165 170 175 

Leu Asp Val Val Pro Ile Asp Asn Asp Asn Thr Ser Tyr Lys Leu Ile 
180 185 190 

Asn Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys Val Ser Phe 
195 200 205 

Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe Ala Ile Leu 
210 215 220 

Lys Cys Asn Asp Lys Lys Phe Asn Gly Ser Gly Pro Cys Thr Asn Val 
225 230 235 240 
Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val Ser Thr Gln 
245 250 255 

Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Gly Val Val Ile Arg Ser 
260 265 270 

Glu Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile Val Gln Leu Lys Glu 
275 280 285 

Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr Arg Lys Ser 
290 295 300 

Ile Thr Ile Gly Pro Gly Arg Ala Phe Tyr Ala Thr Gly Asp Ile Ile 
305 310 315 320 

Gly Asp Ile Arg Gln Ala His Cys Asn Ile Ser Gly Glu Lys Trp Asn 
325 330 335 

Asn Thr Leu Lys Gln Ile Val Thr Lys Leu Gln Ala Gln Phe Gly Asn 
340 345 350 

Lys Thr Ile Val Phe Lys Gln Ser Ser Gly Gly Asp Pro Glu Ile Val 
355 360 365 

Met His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys Asn Ser Thr 
370 375 380 

Gln Leu Phe Asn Ser Thr Trp Asn Asn Thr Ile Gly Pro Asn Asn Thr 
385 390 395 400 

Asn Gly Thr Ile Thr Leu Pro Cys Arg Ile Lys Gln Ile Ile Asn Arg 
405 410 415 

Trp Gln Glu Val Gly Lys Ala Met Tyr Ala Pro Pro Ile Arg Gly Gln 
420 425 430 

Ile Arg Cys Ser Ser Asn Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly 
435 440 445 

Gly Lys Glu Ile Ser Asn Thr Thr Glu Ile Phe Arg Pro Gly Gly Gly 
450 455 460 

Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val 
465 470 475 480 

Lys Ile Glu Pro Leu Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val 
485 490 495 

Val Gln Arg Glu Lys Arg Ala Val Thr Leu Gly Ala Met Phe Leu Gly 
500 505 510 

Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala Arg Ser Leu Thr Leu 
515 520 525 

Thr Val Gln Ala Arg Gln Leu Leu Ser Gly Ile Val Gln Gln Gln Asn 
530 535 540 

Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His Leu Leu Gln Leu Thr 
545 550 555 560 
Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val Leu Ala Val Glu Arg 
565 570 575 

Tyr Leu Lys Asp Gln Gln Leu Leu Gly Ile Trp Gly Cys Ser Gly Lys 
580 585 590 

Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala Ser Trp Ser Asn Lys 
595 600 605 

Ser Leu Asp Gln Ile Trp Asn Asn Met Thr Trp Met Glu Trp Glu Arg 
610 615 620 

Glu Ile Asp Asn Tyr Thr Asn Leu Ile Tyr Thr Leu Ile Glu Glu Ser 
625 630 635 640 

Gln Asn Gln Gln Glu Lys Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys 
645 650 655 

Trp Ala Ser Leu Trp Asn Trp Phe Asp Ile Ser Lys Trp Leu Trp Tyr 
660 665 670 

Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Val Gly Leu Arg Ile 
675 680 685 

Val Phe Thr Val Leu Ser Ile Val Asn Arg Val Arg Gln Gly Tyr Ser 
690 695 700 

Pro Leu Ser Phe Gln Thr Arg Phe Pro Ala Pro Arg Gly Pro Asp Arg 
705 710 715 720 

Pro Glu Gly Ile Glu Glu Glu Gly Gly Glu Arg Asp Arg Asp Arg Ser 
725 730 735 

Ser Pro Leu Val His Gly Leu Leu Ala Leu Ile Trp Asp Asp Leu Arg 
740 745 750 

Ser Leu Cys Leu Phe Ser Tyr His Arg Leu Arg Asp Leu Ile Leu Ile 
755 760 765 

Ala Ala Arg Ile Val Glu Leu Leu Gly Arg Arg Gly Trp Glu Ala Leu 
770 775 780 

Lys Tyr Trp Gly Asn Leu Leu Gln Tyr Trp Ile Gln Glu Leu Lys Asn 
785 790 795 800 

Ser Ala Val Ser Leu Phe Asp Ala Ile Ala Ile Ala Val Ala Glu Gly 
805 810 815 

Thr Asp Arg Ile Ile Glu Val Ala Gln Arg Ile Gly Arg Ala Phe Leu 
820 825 830 

His Ile Pro Arg Arg Ile Arg Gln Gly Phe Glu Arg Ala Leu Leu 
835 840 845 



<210> 3 
<211> 2310 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Ala204 
<400> 3 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 360 
ggcgcctgcc ccaaggtgag cttcgagccc atccccatcc actactgcgc ccccgccggc 420 
ttcgccatcc tgaagtgcaa cgacaagaag ttcaacggca gcggcccctg caccaacgtg 480 
agcaccgtgc agtgcaccca cggcatccgc cccgtggtga gcacccagct gctgctgaac 540 
ggcagcctgg ccgaggaggg cgtggtgatc cgcagcgaga acttcaccga caacgccaag 600 
accatcatcg tgcagctgaa ggagagcgtg gagatcaact gcacccgccc caacaacaac 660 
acccgcaaga gcatcaccat cggccccggc cgcgccttct acgccaccgg cgacatcatc 720 
ggcgacatcc gccaggccca ctgcaacatc agcggcgaga agtggaacaa caccctgaag 780 
cagatcgtga ccaagctgca ggcccagttc ggcaacaaga ccatcgtgtt caagcagagc 840 
agcggcggcg accccgagat cgtgatgcac agcttcaact gcggcggcga gttcttctac 900 
tgcaacagca cccagctgtt caacagcacc tggaacaaca ccatcggccc caacaacacc 960 
aacggcacca tcaccctgcc ctgccgcatc aagcagatca tcaaccgctg gcaggaggtg 1020 
ggcaaggcca tgtacgcccc ccccatccgc ggccagatcc gctgcagcag caacatcacc 1080 
ggcctgctgc tgacccgcga cggcggcaag gagatcagca acaccaccga gatcttccgc 1140 
cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 1200 
aagatcgagc ccctgggcgt ggcccccacc aaggccaagc gccgcgtggt gcagcgcgag 1260 
aagcgcgccg tgaccctggg cgccatgttc ctgggcttcc tgggcgccgc cggcagcacc 1320 
atgggcgccc gcagcctgac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 1380 
cagcagcaga acaacctgct gcgcgccatc gaggcccagc agcacctgct gcagctgacc 1440 
gtgtggggca tcaagcagct gcaggcccgc gtgctggccg tggagcgcta cctgaaggac 1500 
cagcagctgc tgggcatctg gggctgcagc ggcaagctga tctgcaccac cgccgtgccc 1560 
tggaacgcca gctggagcaa caagagcctg gaccagatct ggaacaacat gacctggatg 1620 
gagtgggagc gcgagatcga caactacacc aacctgatct acaccctgat cgaggagagc 1680 
cagaaccagc aggagaagaa cgagcaggag ctgctggagc tggacaagtg ggccagcctg 1740 
tggaactggt tcgacatcag caagtggctg tggtacatca agatcttcat catgatcgtg 1800 
ggcggcctgg tgggcctgcg catcgtgttc accgtgctga gcatcgtgaa ccgcgtgcgc 1860 
cagggctaca gccccctgag cttccagacc cgcttccccg ccccccgcgg ccccgaccgc 1920 
cccgagggca tcgaggagga gggcggcgag cgcgaccgcg accgcagcag ccccctggtg 1980 
cacggcctgc tggccctgat ctgggacgac ctgcgcagcc tgtgcctgtt cagctaccac 2040 
cgcctgcgcg acctgatcct gatcgccgcc cgcatcgtgg agctgctggg ccgccgcggc 2100 
tgggaggccc tgaagtactg gggcaacctg ctgcagtact ggatccagga gctgaagaac 2160 
agcgccgtga gcctgttcga cgccatcgcc atcgccgtgg ccgagggcac cgaccgcatc 2220 
atcgaggtgg cccagcgcat cggccgcgcc ttcctgcaca tcccccgccg catccgccag 2280 
ggcttcgagc gcgccctgct gtaactcgag 2310 

<210> 4 
<211> 2316 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Ile201 
<400> 4 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 

gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 

aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 

ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 

gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 

aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 

atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 

ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840 

cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 

ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 

aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020 

gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080 

atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140 

ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200 

gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260 

cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 

agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380 

atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440 

ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 

aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 

gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620 

tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680 

gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740 

agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 

atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 

gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 

gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 

ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040 

taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 

cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 

aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220 

cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 

cgccagggct tcgagcgcgc cctgctgtaa ctcgag 2316 

<210> 5 
<211> 2322 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Ile201B 
<400> 5 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgcccggc 360 
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840 
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020 

gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080 

atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140 

ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200 

gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260 

cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 

agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380 

atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440 

ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 

aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 

gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620 

tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680 

gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740 

agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 

atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 

gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 

gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 

ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040 

taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 

cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 

aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220 

cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 

cgccagggct tcgagcgcgc cctgctgtaa ctcgagcgtg ct 2322 

<210> 6 
<211> 2328 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Lys121-Val200 
<400> 6 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaaggcc 360 

cccgtgatca cccaggcctg ccccaaggtg agcttcgagc ccatccccat ccactactgc 420 

gcccccgccg gcttcgccat cctgaagtgc aacgacaaga agttcaacgg cagcggcccc 480 

tgcaccaacg tgagcaccgt gcagtgcacc cacggcatcc gccccgtggt gagcacccag 540 

ctgctgctga acggcagcct ggccgaggag ggcgtggtga tccgcagcga gaacttcacc 600 

gacaacgcca agaccatcat cgtgcagctg aaggagagcg tggagatcaa ctgcacccgc 660 

cccaacaaca acacccgcaa gagcatcacc atcggccccg gccgcgcctt ctacgccacc 720 

ggcgacatca tcggcgacat ccgccaggcc cactgcaaca tcagcggcga gaagtggaac 780 

aacaccctga agcagatcgt gaccaagctg caggcccagt tcggcaacaa gaccatcgtg 840 

ttcaagcaga gcagcggcgg cgaccccgag atcgtgatgc acagcttcaa ctgcggcggc 900 

gagttcttct actgcaacag cacccagctg ttcaacagca cctggaacaa caccatcggc 960 

cccaacaaca ccaacggcac catcaccctg ccctgccgca tcaagcagat catcaaccgc 1020 

tggcaggagg tgggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 

agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 

gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200 

tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 

gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320 

gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380 

agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440 

ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 

tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 






accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 

atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 

atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740 

tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 

atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 

aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 

ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980 

agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040 

ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 

ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 

gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220 

accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 

cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg agcgtgct 2328 

<210> 7 
<211> 2334 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Leu122-Ser199 
<400> 7 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720 
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840 
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgctggc aggaggtggg caaggccatg tacgcccccc ccatccgcgg ccagatccgc 1080 
tgcagcagca acatcaccgg cctgctgctg acccgcgacg gcggcaagga gatcagcaac 1140 
accaccgaga tcttccgccc cggcggcggc gacatgcgcg acaactggcg cagcgagctg 1200 
tacaagtaca aggtggtgaa gatcgagccc ctgggcgtgg cccccaccaa ggccaagcgc 1260 
cgcgtggtgc agcgcgagaa gcgcgccgtg accctgggcg ccatgttcct gggcttcctg 1320 
ggcgccgccg gcagcaccat gggcgcccgc agcctgaccc tgaccgtgca ggcccgccag 1380 
ctgctgagcg gcatcgtgca gcagcagaac aacctgctgc gcgccatcga ggcccagcag 1440 
cacctgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcgt gctggccgtg 1500 
gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg caagctgatc 1560 
tgcaccaccg ccgtgccctg gaacgccagc tggagcaaca agagcctgga ccagatctgg 1620 
aacaacatga cctggatgga gtgggagcgc gagatcgaca actacaccaa cctgatctac 1680 
accctgatcg aggagagcca gaaccagcag gagaagaacg agcaggagct gctggagctg 1740 
gacaagtggg ccagcctgtg gaactggttc gacatcagca agtggctgtg gtacatcaag 1800 
atcttcatca tgatcgtggg cggcctggtg ggcctgcgca tcgtgttcac cgtgctgagc 1860 
atcgtgaacc gcgtgcgcca gggctacagc cccctgagct tccagacccg cttccccgcc 1920 
ccccgcggcc ccgaccgccc cgagggcatc gaggaggagg gcggcgagcg cgaccgcgac 1980 
cgcagcagcc ccctggtgca cggcctgctg gccctgatct gggacgacct gcgcagcctg 2040 
tgcctgttca gctaccaccg cctgcgcgac ctgatcctga tcgccgcccg catcgtggag 2100 
ctgctgggcc gccgcggctg ggaggccctg aagtactggg gcaacctgct gcagtactgg 2160 
atccaggagc tgaagaacag cgccgtgagc ctgttcgacg ccatcgccat cgccgtggcc 2220 
gagggcaccg accgcatcat cgaggtggcc cagcgcatcg gccgcgcctt cctgcacatc 2280 
ccccgccgca tccgccaggg cttcgagcgc gccctgctgt aactcgagcg tgct 2334 

<210> 8 
<211> 2316 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Val120-Thr202 



<400> 8 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 
gccacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840 
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcaa ccgctggcag 1020 
gaggtgggca aggccatgta cgcccccccc atccgcggcc agatccgctg cagcagcaac 1080 
atcaccggcc tgctgctgac ccgcgacggc ggcaaggaga tcagcaacac caccgagatc 1140 
ttccgccccg gcggcggcga catgcgcgac aactggcgca gcgagctgta caagtacaag 1200 
gtggtgaaga tcgagcccct gggcgtggcc cccaccaagg ccaagcgccg cgtggtgcag 1260 
cgcgagaagc gcgccgtgac cctgggcgcc atgttcctgg gcttcctggg cgccgccggc 1320 
agcaccatgg gcgcccgcag cctgaccctg accgtgcagg cccgccagct gctgagcggc 1380 
atcgtgcagc agcagaacaa cctgctgcgc gccatcgagg cccagcagca cctgctgcag 1440 
ctgaccgtgt ggggcatcaa gcagctgcag gcccgcgtgc tggccgtgga gcgctacctg 1500 
aaggaccagc agctgctggg catctggggc tgcagcggca agctgatctg caccaccgcc 1560 
gtgccctgga acgccagctg gagcaacaag agcctggacc agatctggaa caacatgacc 1620 
tggatggagt gggagcgcga gatcgacaac tacaccaacc tgatctacac cctgatcgag 1680 
gagagccaga accagcagga gaagaacgag caggagctgc tggagctgga caagtgggcc 1740 
agcctgtgga actggttcga catcagcaag tggctgtggt acatcaagat cttcatcatg 1800 
atcgtgggcg gcctggtggg cctgcgcatc gtgttcaccg tgctgagcat cgtgaaccgc 1860 
gtgcgccagg gctacagccc cctgagcttc cagacccgct tccccgcccc ccgcggcccc 1920 
gaccgccccg agggcatcga ggaggagggc ggcgagcgcg accgcgaccg cagcagcccc 1980 
ctggtgcacg gcctgctggc cctgatctgg gacgacctgc gcagcctgtg cctgttcagc 2040 
taccaccgcc tgcgcgacct gatcctgatc gccgcccgca tcgtggagct gctgggccgc 2100 
cgcggctggg aggccctgaa gtactggggc aacctgctgc agtactggat ccaggagctg 2160 
aagaacagcg ccgtgagcct gttcgacgcc atcgccatcg ccgtggccga gggcaccgac 2220 
cgcatcatcg aggtggccca gcgcatcggc cgcgccttcc tgcacatccc ccgccgcatc 2280 
cgccagggct tcgagcgcgc cctgctgtaa ctcgag 2316 

<210> 9 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Trp427-Gly431 

<400> 9 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 

agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 

atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 

gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 

agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 

atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 

cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 

atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 

ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 

aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 

ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgctgggg cggcaaggcc 1260 

atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 

ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380 

ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440 

cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 

gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 

cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 

aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 

atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740 

ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 

agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 

cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 

caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 

ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040 

gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 

agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 

atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220 

ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 

gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 

ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400 

agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 

gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 
cgcgccctgc tgtaactcga g 2541 

<210> 10 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Arg426-Gly431 
<400> 10 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 

agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 

atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 

gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 

agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 

atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 

cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 

atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 

ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 

aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 

ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcgg cggcaaggcc 1260 

atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 

ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380 

ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440 

cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 

gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 

cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 

aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 

atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740 

ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 

agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 

cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 

caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 

ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040 

gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 

agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 

atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220 

ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 

gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 

ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400 

agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 

gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 
cgcgccctgc tgtaactcga g 2541 

<210> 11 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Arg426-Gly431B 
<400> 11 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 

agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 

agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 

gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 

gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 

accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 

agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 

atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 

cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 

atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 

ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 

aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 

ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcag cggcaaggcc 1260 

atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 

ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380 

ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440 

cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 

gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 

cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 

aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 

atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740 

ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 

agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 

cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 

caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 

ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040 

gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 

agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 

atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220 

ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 

gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 

ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400 

agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 

gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 

cgcgccctgc tgtaactcga g 2541 

<210> 12 
<211> 2541 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Arg426-Lys432 
<400> 12 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 

gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 

atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 

ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 

aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 

ggcaccatca ccctgccctg ccgcatcaag cagatcatca accgcggcgg caacaaggcc 1260 

atgtacgccc cccccatccg cggccagatc cgctgcagca gcaacatcac cggcctgctg 1320 

ctgacccgcg acggcggcaa ggagatcagc aacaccaccg agatcttccg ccccggcggc 1380 

ggcgacatgc gcgacaactg gcgcagcgag ctgtacaagt acaaggtggt gaagatcgag 1440 

cccctgggcg tggcccccac caaggccaag cgccgcgtgg tgcagcgcga gaagcgcgcc 1500 

gtgaccctgg gcgccatgtt cctgggcttc ctgggcgccg ccggcagcac catgggcgcc 1560 

cgcagcctga ccctgaccgt gcaggcccgc cagctgctga gcggcatcgt gcagcagcag 1620 

aacaacctgc tgcgcgccat cgaggcccag cagcacctgc tgcagctgac cgtgtggggc 1680 

atcaagcagc tgcaggcccg cgtgctggcc gtggagcgct acctgaagga ccagcagctg 1740 

ctgggcatct ggggctgcag cggcaagctg atctgcacca ccgccgtgcc ctggaacgcc 1800 

agctggagca acaagagcct ggaccagatc tggaacaaca tgacctggat ggagtgggag 1860 

cgcgagatcg acaactacac caacctgatc tacaccctga tcgaggagag ccagaaccag 1920 

caggagaaga acgagcagga gctgctggag ctggacaagt gggccagcct gtggaactgg 1980 

ttcgacatca gcaagtggct gtggtacatc aagatcttca tcatgatcgt gggcggcctg 2040 

gtgggcctgc gcatcgtgtt caccgtgctg agcatcgtga accgcgtgcg ccagggctac 2100 

agccccctga gcttccagac ccgcttcccc gccccccgcg gccccgaccg ccccgagggc 2160 

atcgaggagg agggcggcga gcgcgaccgc gaccgcagca gccccctggt gcacggcctg 2220 

ctggccctga tctgggacga cctgcgcagc ctgtgcctgt tcagctacca ccgcctgcgc 2280 

gacctgatcc tgatcgccgc ccgcatcgtg gagctgctgg gccgccgcgg ctgggaggcc 2340 

ctgaagtact ggggcaacct gctgcagtac tggatccagg agctgaagaa cagcgccgtg 2400 

agcctgttcg acgccatcgc catcgccgtg gccgagggca ccgaccgcat catcgaggtg 2460 

gcccagcgca tcggccgcgc cttcctgcac atcccccgcc gcatccgcca gggcttcgag 2520 

cgcgccctgc tgtaactcga g 2541 

<210> 13 
<211> 2535 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Asn425-Lys432 
<400> 13 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatca acgcccccaa ggccatgtac 1260 
gcccccccca tccgcggcca gatccgctgc agcagcaaca tcaccggcct gctgctgacc 1320 
cgcgacggcg gcaaggagat cagcaacacc accgagatct tccgccccgg cggcggcgac 1380 
atgcgcgaca actggcgcag cgagctgtac aagtacaagg tggtgaagat cgagcccctg 1440 
ggcgtggccc ccaccaaggc caagcgccgc gtggtgcagc gcgagaagcg cgccgtgacc 1500 
ctgggcgcca tgttcctggg cttcctgggc gccgccggca gcaccatggg cgcccgcagc 1560 
ctgaccctga ccgtgcaggc ccgccagctg ctgagcggca tcgtgcagca gcagaacaac 1620 
ctgctgcgcg ccatcgaggc ccagcagcac ctgctgcagc tgaccgtgtg gggcatcaag 1680 
cagctgcagg cccgcgtgct ggccgtggag cgctacctga aggaccagca gctgctgggc 1740 
atctggggct gcagcggcaa gctgatctgc accaccgccg tgccctggaa cgccagctgg 1800 
agcaacaaga gcctggacca gatctggaac aacatgacct ggatggagtg ggagcgcgag 1860 
atcgacaact acaccaacct gatctacacc ctgatcgagg agagccagaa ccagcaggag 1920 
aagaacgagc aggagctgct ggagctggac aagtgggcca gcctgtggaa ctggttcgac 1980 
atcagcaagt ggctgtggta catcaagatc ttcatcatga tcgtgggcgg cctggtgggc 2040 
ctgcgcatcg tgttcaccgt gctgagcatc gtgaaccgcg tgcgccaggg ctacagcccc 2100 
ctgagcttcc agacccgctt ccccgccccc cgcggccccg accgccccga gggcatcgag 2160 
gaggagggcg gcgagcgcga ccgcgaccgc agcagccccc tggtgcacgg cctgctggcc 2220 
ctgatctggg acgacctgcg cagcctgtgc ctgttcagct accaccgcct gcgcgacctg 2280 
atcctgatcg ccgcccgcat cgtggagctg ctgggccgcc gcggctggga ggccctgaag 2340 
tactggggca acctgctgca gtactggatc caggagctga agaacagcgc cgtgagcctg 2400 
ttcgacgcca tcgccatcgc cgtggccgag ggcaccgacc gcatcatcga ggtggcccag 2460 
cgcatcggcc gcgccttcct gcacatcccc cgccgcatcc gccagggctt cgagcgcgcc 2520 
ctgctgtaac tcgag 2535 

<210> 14 
<211> 2529 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Ile424-Ala433 
<400> 14 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcatcg gcggcgccat gtacgccccc 1260 
cccatccgcg gccagatccg ctgcagcagc aacatcaccg gcctgctgct gacccgcgac 1320 
ggcggcaagg agatcagcaa caccaccgag atcttccgcc ccggcggcgg cgacatgcgc 1380 
gacaactggc gcagcgagct gtacaagtac aaggtggtga agatcgagcc cctgggcgtg 1440 
gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gaccctgggc 1500 
gccatgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgcccg cagcctgacc 1560 
ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagaa caacctgctg 1620 
cgcgccatcg aggcccagca gcacctgctg cagctgaccg tgtggggcat caagcagctg 1680 

caggcccgcg tgctggccgt ggagcgctac ctgaaggacc agcagctgct gggcatctgg 1740 

ggctgcagcg gcaagctgat ctgcaccacc gccgtgccct ggaacgccag ctggagcaac 1800 

aagagcctgg accagatctg gaacaacatg acctggatgg agtgggagcg cgagatcgac 1860 

aactacacca acctgatcta caccctgatc gaggagagcc agaaccagca ggagaagaac 1920 

gagcaggagc tgctggagct ggacaagtgg gccagcctgt ggaactggtt cgacatcagc 1980 

aagtggctgt ggtacatcaa gatcttcatc atgatcgtgg gcggcctggt gggcctgcgc 2040 

atcgtgttca ccgtgctgag catcgtgaac cgcgtgcgcc agggctacag ccccctgagc 2100 

ttccagaccc gcttccccgc cccccgcggc cccgaccgcc ccgagggcat cgaggaggag 2160 

ggcggcgagc gcgaccgcga ccgcagcagc cccctggtgc acggcctgct ggccctgatc 2220 

tgggacgacc tgcgcagcct gtgcctgttc agctaccacc gcctgcgcga cctgatcctg 2280 

atcgccgccc gcatcgtgga gctgctgggc cgccgcggct gggaggccct gaagtactgg 2340 

ggcaacctgc tgcagtactg gatccaggag ctgaagaaca gcgccgtgag cctgttcgac 2400 

gccatcgcca tcgccgtggc cgagggcacc gaccgcatca tcgaggtggc ccagcgcatc 2460 

ggccgcgcct tcctgcacat cccccgccgc atccgccagg gcttcgagcg cgccctgctg 2520 

taactcgag 2529 

<210> 15 
<211> 2523 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Ile423-Met434 



<400> 15 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagatcggcg gcatgtacgc cccccccatc 1260 
cgcggccaga tccgctgcag cagcaacatc accggcctgc tgctgacccg cgacggcggc 1320 
aaggagatca gcaacaccac cgagatcttc cgccccggcg gcggcgacat gcgcgacaac 1380 
tggcgcagcg agctgtacaa gtacaaggtg gtgaagatcg agcccctggg cgtggccccc 1440 
accaaggcca agcgccgcgt ggtgcagcgc gagaagcgcg ccgtgaccct gggcgccatg 1500 
ttcctgggct tcctgggcgc cgccggcagc accatgggcg cccgcagcct gaccctgacc 1560 
gtgcaggccc gccagctgct gagcggcatc gtgcagcagc agaacaacct gctgcgcgcc 1620 
atcgaggccc agcagcacct gctgcagctg accgtgtggg gcatcaagca gctgcaggcc 1680 
cgcgtgctgg ccgtggagcg ctacctgaag gaccagcagc tgctgggcat ctggggctgc 1740 
agcggcaagc tgatctgcac caccgccgtg ccctggaacg ccagctggag caacaagagc 1800 
ctggaccaga tctggaacaa catgacctgg atggagtggg agcgcgagat cgacaactac 1860 
accaacctga tctacaccct gatcgaggag agccagaacc agcaggagaa gaacgagcag 1920 
gagctgctgg agctggacaa gtgggccagc ctgtggaact ggttcgacat cagcaagtgg 1980 
ctgtggtaca tcaagatctt catcatgatc gtgggcggcc tggtgggcct gcgcatcgtg 2040 

ttcaccgtgc tgagcatcgt gaaccgcgtg cgccagggct acagccccct gagcttccag 2100 

acccgcttcc ccgccccccg cggccccgac cgccccgagg gcatcgagga ggagggcggc 2160 

gagcgcgacc gcgaccgcag cagccccctg gtgcacggcc tgctggccct gatctgggac 2220 

gacctgcgca gcctgtgcct gttcagctac caccgcctgc gcgacctgat cctgatcgcc 2280 

gcccgcatcg tggagctgct gggccgccgc ggctgggagg ccctgaagta ctggggcaac 2340 

ctgctgcagt actggatcca ggagctgaag aacagcgccg tgagcctgtt cgacgccatc 2400 

gccatcgccg tggccgaggg caccgaccgc atcatcgagg tggcccagcg catcggccgc 2460 

gccttcctgc acatcccccg ccgcatccgc cagggcttcg agcgcgccct gctgtaactc 2520 



<210> 16 
<211> 2517 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Gln422-Tyr435 



<400> 16 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag cagggcggct acgccccccc catccgcggc 1260 
cagatccgct gcagcagcaa catcaccggc ctgctgctga cccgcgacgg cggcaaggag 1320 
atcagcaaca ccaccgagat cttccgcccc ggcggcggcg acatgcgcga caactggcgc 1380 
agcgagctgt acaagtacaa ggtggtgaag atcgagcccc tgggcgtggc ccccaccaag 1440 
gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtga ccctgggcgc catgttcctg 1500 
ggcttcctgg gcgccgccgg cagcaccatg ggcgcccgca gcctgaccct gaccgtgcag 1560 
gcccgccagc tgctgagcgg catcgtgcag cagcagaaca acctgctgcg cgccatcgag 1620 
gcccagcagc acctgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 1680 
ctggccgtgg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 1740 
aagctgatct gcaccaccgc cgtgccctgg aacgccagct ggagcaacaa gagcctggac 1800 
cagatctgga acaacatgac ctggatggag tgggagcgcg agatcgacaa ctacaccaac 1860 
ctgatctaca ccctgatcga ggagagccag aaccagcagg agaagaacga gcaggagctg 1920 
ctggagctgg acaagtgggc cagcctgtgg aactggttcg acatcagcaa gtggctgtgg 1980 
tacatcaaga tcttcatcat gatcgtgggc ggcctggtgg gcctgcgcat cgtgttcacc 2040 
gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagacccgc 2100 
ttccccgccc cccgcggccc cgaccgcccc gagggcatcg aggaggaggg cggcgagcgc 2160 
gaccgcgacc gcagcagccc cctggtgcac ggcctgctgg ccctgatctg ggacgacctg 2220 
cgcagcctgt gcctgttcag ctaccaccgc ctgcgcgacc tgatcctgat cgccgcccgc 2280 
atcgtggagc tgctgggccg ccgcggctgg gaggccctga agtactgggg caacctgctg 2340 
cagtactgga tccaggagct gaagaacagc gccgtgagcc tgttcgacgc catcgccatc 2400 
gccgtggccg agggcaccga ccgcatcatc gaggtggccc agcgcatcgg ccgcgccttc 2460 
ctgcacatcc cccgccgcat ccgccagggc ttcgagcgcg ccctgctgta actcgag 2517 

<210> 17 
<211> 2517 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Gln422-Tyr435B 
<400> 17 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgaccct gcactgcacc aacctgaaga acgccaccaa caccaagagc 420 
agcaactgga aggagatgga ccgcggcgag atcaagaact gcagcttcaa ggtgaccacc 480 
agcatccgca acaagatgca gaaggagtac gccctgttct acaagctgga cgtggtgccc 540 
atcgacaacg acaacaccag ctacaagctg atcaactgca acaccagcgt gatcacccag 600 
gcctgcccca aggtgagctt cgagcccatc cccatccact actgcgcccc cgccggcttc 660 
gccatcctga agtgcaacga caagaagttc aacggcagcg gcccctgcac caacgtgagc 720 
accgtgcagt gcacccacgg catccgcccc gtggtgagca cccagctgct gctgaacggc 780 
agcctggccg aggagggcgt ggtgatccgc agcgagaact tcaccgacaa cgccaagacc 840 
atcatcgtgc agctgaagga gagcgtggag atcaactgca cccgccccaa caacaacacc 900 
cgcaagagca tcaccatcgg ccccggccgc gccttctacg ccaccggcga catcatcggc 960 
gacatccgcc aggcccactg caacatcagc ggcgagaagt ggaacaacac cctgaagcag 1020 
atcgtgacca agctgcaggc ccagttcggc aacaagacca tcgtgttcaa gcagagcagc 1080 
ggcggcgacc ccgagatcgt gatgcacagc ttcaactgcg gcggcgagtt cttctactgc 1140 
aacagcaccc agctgttcaa cagcacctgg aacaacacca tcggccccaa caacaccaac 1200 
ggcaccatca ccctgccctg ccgcatcaag caggccccct acgccccccc catccgcggc 1260 
cagatccgct gcagcagcaa catcaccggc ctgctgctga cccgcgacgg cggcaaggag 1320 
atcagcaaca ccaccgagat cttccgcccc ggcggcggcg acatgcgcga caactggcgc 1380 
agcgagctgt acaagtacaa ggtggtgaag atcgagcccc tgggcgtggc ccccaccaag 1440 
gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtga ccctgggcgc catgttcctg 1500 
ggcttcctgg gcgccgccgg cagcaccatg ggcgcccgca gcctgaccct gaccgtgcag 1560 
gcccgccagc tgctgagcgg catcgtgcag cagcagaaca acctgctgcg cgccatcgag 1620 
gcccagcagc acctgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 1680 
ctggccgtgg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 1740 
aagctgatct gcaccaccgc cgtgccctgg aacgccagct ggagcaacaa gagcctggac 1800 
cagatctgga acaacatgac ctggatggag tgggagcgcg agatcgacaa ctacaccaac 1860 
ctgatctaca ccctgatcga ggagagccag aaccagcagg agaagaacga gcaggagctg 1920 
ctggagctgg acaagtgggc cagcctgtgg aactggttcg acatcagcaa gtggctgtgg 1980 
tacatcaaga tcttcatcat gatcgtgggc ggcctggtgg gcctgcgcat cgtgttcacc 2040 
gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagacccgc 2100 
ttccccgccc cccgcggccc cgaccgcccc gagggcatcg aggaggaggg cggcgagcgc 2160 
gaccgcgacc gcagcagccc cctggtgcac ggcctgctgg ccctgatctg ggacgacctg 2220 
cgcagcctgt gcctgttcag ctaccaccgc ctgcgcgacc tgatcctgat cgccgcccgc 2280 
atcgtggagc tgctgggccg ccgcggctgg gaggccctga agtactgggg caacctgctg 2340 
cagtactgga tccaggagct gaagaacagc gccgtgagcc tgttcgacgc catcgccatc 2400 
gccgtggccg agggcaccga ccgcatcatc gaggtggccc agcgcatcgg ccgcgccttc 2460 
ctgcacatcc cccgccgcat ccgccagggc ttcgagcgcg ccctgctgta actcgag 2517 

<210> 18 
<211> 2322 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199; 
Arg426-Gly431 

<400> 18 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 

tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 

ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 

acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 

ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 

acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720 

gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 

tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840 

atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 

ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 

atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 

aaccgcggcg gcggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 

agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 

gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200 

tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 

gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320 

gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380 

agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440 

ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 

tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 

accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 

atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 

atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740 

tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 

atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 

aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 

ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980 

agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040 

ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 

ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 

gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220 

accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 

cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322 

<210> 19 
<211> 2322 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199; 
Arg426-Lys432 

<400> 19 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720 
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840 
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgcggcg gcaacaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200 
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320 
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380 
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440 
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740 
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980 
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040 
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322 

<210> 20 
<211> 2322 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Leu122-Ser199;
Trp427-Gly431 

<400> 20 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
ggcaacagcg tgatcaccca ggcctgcccc aaggtgagct tcgagcccat ccccatccac 420 
tactgcgccc ccgccggctt cgccatcctg aagtgcaacg acaagaagtt caacggcagc 480 
ggcccctgca ccaacgtgag caccgtgcag tgcacccacg gcatccgccc cgtggtgagc 540 
acccagctgc tgctgaacgg cagcctggcc gaggagggcg tggtgatccg cagcgagaac 600 
ttcaccgaca acgccaagac catcatcgtg cagctgaagg agagcgtgga gatcaactgc 660 
acccgcccca acaacaacac ccgcaagagc atcaccatcg gccccggccg cgccttctac 720
gccaccggcg acatcatcgg cgacatccgc caggcccact gcaacatcag cggcgagaag 780 
tggaacaaca ccctgaagca gatcgtgacc aagctgcagg cccagttcgg caacaagacc 840 
atcgtgttca agcagagcag cggcggcgac cccgagatcg tgatgcacag cttcaactgc 900 
ggcggcgagt tcttctactg caacagcacc cagctgttca acagcacctg gaacaacacc 960 
atcggcccca acaacaccaa cggcaccatc accctgccct gccgcatcaa gcagatcatc 1020 
aaccgctggg gcggcaaggc catgtacgcc ccccccatcc gcggccagat ccgctgcagc 1080 
agcaacatca ccggcctgct gctgacccgc gacggcggca aggagatcag caacaccacc 1140 
gagatcttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 1200 
tacaaggtgg tgaagatcga gcccctgggc gtggccccca ccaaggccaa gcgccgcgtg 1260 
gtgcagcgcg agaagcgcgc cgtgaccctg ggcgccatgt tcctgggctt cctgggcgcc 1320 
gccggcagca ccatgggcgc ccgcagcctg accctgaccg tgcaggcccg ccagctgctg 1380 
agcggcatcg tgcagcagca gaacaacctg ctgcgcgcca tcgaggccca gcagcacctg 1440
ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc cgtggagcgc 1500 
tacctgaagg accagcagct gctgggcatc tggggctgca gcggcaagct gatctgcacc 1560 
accgccgtgc cctggaacgc cagctggagc aacaagagcc tggaccagat ctggaacaac 1620 
atgacctgga tggagtggga gcgcgagatc gacaactaca ccaacctgat ctacaccctg 1680 
atcgaggaga gccagaacca gcaggagaag aacgagcagg agctgctgga gctggacaag 1740 
tgggccagcc tgtggaactg gttcgacatc agcaagtggc tgtggtacat caagatcttc 1800 
atcatgatcg tgggcggcct ggtgggcctg cgcatcgtgt tcaccgtgct gagcatcgtg 1860 
aaccgcgtgc gccagggcta cagccccctg agcttccaga cccgcttccc cgccccccgc 1920 
ggccccgacc gccccgaggg catcgaggag gagggcggcg agcgcgaccg cgaccgcagc 1980 
agccccctgg tgcacggcct gctggccctg atctgggacg acctgcgcag cctgtgcctg 2040 
ttcagctacc accgcctgcg cgacctgatc ctgatcgccg cccgcatcgt ggagctgctg 2100 
ggccgccgcg gctgggaggc cctgaagtac tggggcaacc tgctgcagta ctggatccag 2160 
gagctgaaga acagcgccgt gagcctgttc gacgccatcg ccatcgccgt ggccgagggc 2220 
accgaccgca tcatcgaggt ggcccagcgc atcggccgcg ccttcctgca catcccccgc 2280 
cgcatccgcc agggcttcga gcgcgccctg ctgtaactcg ag 2322 

<210> 21 
<211> 2310 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Lys121-Val200;
Asn425-Lys432 

<400> 21 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaaggcc 360
cccgtgatca cccaggcctg ccccaaggtg agcttcgagc ccatccccat ccactactgc 420
gcccccgccg gcttcgccat cctgaagtgc aacgacaaga agttcaacgg cagcggcccc 480
tgcaccaacg tgagcaccgt gcagtgcacc cacggcatcc gccccgtggt gagcacccag 540
ctgctgctga acggcagcct ggccgaggag ggcgtggtga tccgcagcga gaacttcacc 600
gacaacgcca agaccatcat cgtgcagctg aaggagagcg tggagatcaa ctgcacccgc 660
cccaacaaca acacccgcaa gagcatcacc atcggccccg gccgcgcctt ctacgccacc 720
ggcgacatca tcggcgacat ccgccaggcc cactgcaaca tcagcggcga gaagtggaac 780
aacaccctga agcagatcgt gaccaagctg caggcccagt tcggcaacaa gaccatcgtg 840
ttcaagcaga gcagcggcgg cgaccccgag atcgtgatgc acagcttcaa ctgcggcggc 900
gagttcttct actgcaacag cacccagctg ttcaacagca cctggaacaa caccatcggc 960
cccaacaaca ccaacggcac catcaccctg ccctgccgca tcaagcagat catcaacgcc 1020
cccaaggcca tgtacgcccc ccccatccgc ggccagatcc gctgcagcag caacatcacc 1080
ggcctgctgc tgacccgcga cggcggcaag gagatcagca acaccaccga gatcttccgc 1140
cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 1200
aagatcgagc ccctgggcgt ggcccccacc aaggccaagc gccgcgtggt gcagcgcgag 1260
aagcgcgccg tgaccctggg cgccatgttc ctgggcttcc tgggcgccgc cggcagcacc 1320
atgggcgccc gcagcctgac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 1380
cagcagcaga acaacctgct gcgcgccatc gaggcccagc agcacctgct gcagctgacc 1440
gtgtggggca tcaagcagct gcaggcccgc gtgctggccg tggagcgcta cctgaaggac 1500
cagcagctgc tgggcatctg gggctgcagc ggcaagctga tctgcaccac cgccgtgccc 1560
tggaacgcca gctggagcaa caagagcctg gaccagatct ggaacaacat gacctggatg 1620
gagtgggagc gcgagatcga caactacacc aacctgatct acaccctgat cgaggagagc 1680
cagaaccagc aggagaagaa cgagcaggag ctgctggagc tggacaagtg ggccagcctg 1740
tggaactggt tcgacatcag caagtggctg tggtacatca agatcttcat catgatcgtg 1800
ggcggcctgg tgggcctgcg catcgtgttc accgtgctga gcatcgtgaa ccgcgtgcgc 1860
cagggctaca gccccctgag cttccagacc cgcttccccg ccccccgcgg ccccgaccgc 1920
cccgagggca tcgaggagga gggcggcgag cgcgaccgcg accgcagcag ccccctggtg 1980
cacggcctgc tggccctgat ctgggacgac ctgcgcagcc tgtgcctgtt cagctaccac 2040
cgcctgcgcg acctgatcct gatcgccgcc cgcatcgtgg agctgctggg ccgccgcggc 2100
tgggaggccc tgaagtactg gggcaacctg ctgcagtact ggatccagga gctgaagaac 2160
agcgccgtga gcctgttcga cgccatcgcc atcgccgtgg ccgagggcac cgaccgcatc 2220
atcgaggtgg cccagcgcat cggccgcgcc ttcctgcaca tcccccgccg catccgccag 2280
ggcttcgagc gcgccctgct gtaactcgag 2310

<210> 22
<211> 2298
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: Val120-Ile201;
Ile424-Ala433

<400> 22

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360
atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420
gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480
aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540
ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600
gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660
aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720
atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780
ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840
cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900
ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960
aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800

ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860 

cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920 

gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980 

gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040 

ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100 

aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160 

ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220 

cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280 

gccctgctgt aactcgag 2298

<210> 23 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Ile201B;
Ile424-Ala433

<400> 23 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgcccggc 360 

atcacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 

gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 

aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 

ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 

gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 

aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 

atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 

ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840 

cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 

ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 

aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020 

tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080 

acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140 

gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200 

ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260 

accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320 

agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380 

aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440 

aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500 

ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560 

tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620 

gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680 

gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740 

gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800 

ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860 

cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920 

gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980 

gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040 

ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100 

aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160 

ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220 

cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280 
gccctgctgt aactcgag 2298

<210> 24 
<211> 2298 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val120-Thr202;
Ile424-Ala433 

<400> 24 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgggcggc 360 

gccacccagg cctgccccaa ggtgagcttc gagcccatcc ccatccacta ctgcgccccc 420 

gccggcttcg ccatcctgaa gtgcaacgac aagaagttca acggcagcgg cccctgcacc 480 

aacgtgagca ccgtgcagtg cacccacggc atccgccccg tggtgagcac ccagctgctg 540 

ctgaacggca gcctggccga ggagggcgtg gtgatccgca gcgagaactt caccgacaac 600 

gccaagacca tcatcgtgca gctgaaggag agcgtggaga tcaactgcac ccgccccaac 660 

aacaacaccc gcaagagcat caccatcggc cccggccgcg ccttctacgc caccggcgac 720 

atcatcggcg acatccgcca ggcccactgc aacatcagcg gcgagaagtg gaacaacacc 780 

ctgaagcaga tcgtgaccaa gctgcaggcc cagttcggca acaagaccat cgtgttcaag 840 

cagagcagcg gcggcgaccc cgagatcgtg atgcacagct tcaactgcgg cggcgagttc 900 

ttctactgca acagcaccca gctgttcaac agcacctgga acaacaccat cggccccaac 960 

aacaccaacg gcaccatcac cctgccctgc cgcatcaagc agatcatcgg cggcgccatg 1020 

tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1080 

acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1140 

gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1200 

ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1260 

accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1320

agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1380 

aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1440 

aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1500 

ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1560 

tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1620 

gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1680 

gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1740 

gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1800 

ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1860 

cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1920 

gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 1980 

gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2040 

ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2100 

aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2160 

ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2220 

cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2280 

gccctgctgt aactcgag 2298 

<210> 25 
<211> 2358 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val127-Asn195
<400> 25

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 
gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 
cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 
accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 
gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 
cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 
acccccctgt gcgtgggggc agggaactgc aacaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480 
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540 
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600 
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660 
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720 
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780 
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840 
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900 
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960 
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020 
accctgccct gccgcatcaa gcagatcatc aaccgctggc aggaggtggg caaggccatg 1080 
tacgcccccc ccatccgcgg ccagatccgc tgcagcagca acatcaccgg cctgctgctg 1140 
acccgcgacg gcggcaagga gatcagcaac accaccgaga tcttccgccc cggcggcggc 1200 
gacatgcgcg acaactggcg cagcgagctg tacaagtaca aggtggtgaa gatcgagccc 1260 
ctgggcgtgg cccccaccaa ggccaagcgc cgcgtggtgc agcgcgagaa gcgcgccgtg 1320 
accctgggcg ccatgttcct gggcttcctg ggcgccgccg gcagcaccat gggcgcccgc 1380 
agcctgaccc tgaccgtgca ggcccgccag ctgctgagcg gcatcgtgca gcagcagaac 1440 
aacctgctgc gcgccatcga ggcccagcag cacctgctgc agctgaccgt gtggggcatc 1500 
aagcagctgc aggcccgcgt gctggccgtg gagcgctacc tgaaggacca gcagctgctg 1560 
ggcatctggg gctgcagcgg caagctgatc tgcaccaccg ccgtgccctg gaacgccagc 1620 
tggagcaaca agagcctgga ccagatctgg aacaacatga cctggatgga gtgggagcgc 1680 
gagatcgaca actacaccaa cctgatctac accctgatcg aggagagcca gaaccagcag 1740 
gagaagaacg agcaggagct gctggagctg gacaagtggg ccagcctgtg gaactggttc 1800 
gacatcagca agtggctgtg gtacatcaag atcttcatca tgatcgtggg cggcctggtg 1860 
ggcctgcgca tcgtgttcac cgtgctgagc atcgtgaacc gcgtgcgcca gggctacagc 1920 
cccctgagct tccagacccg cttccccgcc ccccgcggcc ccgaccgccc cgagggcatc 1980 
gaggaggagg gcggcgagcg cgaccgcgac cgcagcagcc ccctggtgca cggcctgctg 2040 
gccctgatct gggacgacct gcgcagcctg tgcctgttca gctaccaccg cctgcgcgac 2100 
ctgatcctga tcgccgcccg catcgtggag ctgctgggcc gccgcggctg ggaggccctg 2160 
aagtactggg gcaacctgct gcagtactgg atccaggagc tgaagaacag cgccgtgagc 2220 
ctgttcgacg ccatcgccat cgccgtggcc gagggcaccg accgcatcat cgaggtggcc 2280 
cagcgcatcg gccgcgcctt cctgcacatc ccccgccgca tccgccaggg cttcgagcgc 2340 
gccctgctgt aactcgag 2358

<210> 26 
<211> 2352 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Val127-Asn195;
Arg426-Gly431

<400> 26 

gaattcgcca ccatggatgc aatgaagaga gggctctgct gtgtgctgct gctgtgtgga 60 

gcagtcttcg tttcgcccag cgccgtggag aagctgtggg tgaccgtgta ctacggcgtg 120 

cccgtgtgga aggaggccac caccaccctg ttctgcgcca gcgacgccaa ggcctacgac 180 

accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 240 

gagatcgtgc tggagaacgt gaccgagaac ttcaacatgt ggaagaacaa catggtggag 300 

cagatgcacg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 360 

acccccctgt gcgtgggggc agggaactgc aacaccagcg tgatcaccca ggcctgcccc 420 
aaggtgagct tcgagcccat ccccatccac tactgcgccc ccgccggctt cgccatcctg 480
aagtgcaacg acaagaagtt caacggcagc ggcccctgca ccaacgtgag caccgtgcag 540
tgcacccacg gcatccgccc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 600
gaggagggcg tggtgatccg cagcgagaac ttcaccgaca acgccaagac catcatcgtg 660
cagctgaagg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 720
atcaccatcg gccccggccg cgccttctac gccaccggcg acatcatcgg cgacatccgc 780
caggcccact gcaacatcag cggcgagaag tggaacaaca ccctgaagca gatcgtgacc 840
aagctgcagg cccagttcgg caacaagacc atcgtgttca agcagagcag cggcggcgac 900
cccgagatcg tgatgcacag cttcaactgc ggcggcgagt tcttctactg caacagcacc 960
cagctgttca acagcacctg gaacaacacc atcggcccca acaacaccaa cggcaccatc 1020
accctgccct gccgcatcaa gcagatcatc aaccgcggcg gcggcaaggc catgtacgcc 1080
ccccccatcc gcggccagat ccgctgcagc agcaacatca ccggcctgct gctgacccgc 1140
gacggcggca aggagatcag caacaccacc gagatcttcc gccccggggg cggcgacatg 1200
cgcgacaact ggcgcagcga gctgtacaag tacaaggtgg tgaagatcga gcccctgggc 1260
gtggccccca ccaaggccaa gcgccgcgtg gtgcagcgcg agaagcgcgc cgtgaccctg 1320
ggcgccatgt tcctgggctt cctgggcgcc gccggcagca ccatgggcgc ccgcagcctg 1380
accctgaccg tgcaggcccg ccagctgctg agcggcatcg tgcagcagca gaacaacctg 1440
ctgcgcgcca tcgaggccca gcagcacctg ctgcagctga ccgtgtgggg catcaagcag 1500
ctgcaggccc gcgtgctggc cgtggagcgc tacctgaagg accagcagct gctgggcatc 1560
tggggctgca gcggcaagct gatctgcacc accgccgtgc cctggaacgc cagctggagc 1620
aacaagagcc tggaccagat ctggaacaac atgacctgga tggagtggga gcgcgagatc 1680
gacaactaca ccaacctgat ctacaccctg atcgaggaga gccagaacca gcaggagaag 1740
aacgagcagg agctgctgga gctggacaag tgggccagcc tgtggaactg gttcgacatc 1800
agcaagtggc tgtggtacat caagatcttc atcatgatcg tgggcggcct ggtgggcctg 1860
cgcatcgtgt tcaccgtgct gagcatcgtg aaccgcgtgc gccagggcta cagccccctg 1920
agcttccaga cccgcttccc cgccccccgc ggccccgacc gccccgaggg catcgaggag 1980
gagggcggcg agcgcgaccg cgaccgcagc agccccctgg tgcacggcct gctggccctg 2040
atctgggacg acctgcgcag cctgtgcctg ttcagctacc accgcctgcg cgacctgatc 2100
ctgatcgccg cccgcatcgt ggagctgctg ggccgccgcg gctgggaggc cctgaagtac 2160
tggggcaacc tgctgcagta ctggatccag gagctgaaga acagcgccgt gagcctgttc 2220
gacgccatcg ccatcgccgt ggccgagggc accgaccgca tcatcgaggt ggcccagcgc 2280
atcggccgcg ccttcctgca catcccccgc cgcatccgcc agggcttcga gcgcgccctg 2340
ctgtaactcg ag 2352